
The Interactions Between the Effects of Implicit and Explicit Feedback and Individual Differences in Language Analytic Ability and Working Memory

SHAOFENG LI
University of Auckland
Department of Applied Language Studies and Linguistics
Auckland 1142
New Zealand
Email: [email protected]

This study investigated the interactions between two types of feedback (implicit vs. explicit) and two aptitude components (language analytic ability and working memory) in second language Chinese learning. Seventy-eight L2 Chinese learners from two large U.S. universities were assigned to three dyadic NS–NNS interaction conditions and received implicit (recasts), explicit (metalinguistic correction), or no feedback (control) in response to their non-target-like oral production of Chinese classifiers. The treatment effects were measured by a grammaticality judgment test and an elicited imitation test. The Words in Sentences subtest of the MLAT was used to measure language analytic ability; a listening span test was utilized as the measure of working memory. A principal components analysis and a structural equation modeling analysis established that working memory was an aptitude component. Multiple regression analyses showed that language analytic ability was predictive of the effects of implicit feedback, and working memory mediated the effects of explicit feedback; all the statistically significant results involved delayed posttest scores. Interpretations were sought with recourse to the mechanisms of the cognitive constructs and the processing demands imposed by the different learning conditions.

Keywords: corrective feedback; aptitude; language analytic ability; working memory

CORRECTIVE FEEDBACK HAS BEEN A MAJOR theme in recent SLA research because of the important place it occupies in pedagogy and theory construction. Practitioners are faced with the conundrums of whether learners’ errors should be responded to and, if so, when and how feedback should be provided to achieve optimal instructional effects. Researchers are interested in the cognitive and social dimensions of feedback as a theoretical construct that facilitates or impedes interlanguage development (see Krashen, 1981). Empirical investigation into corrective feedback has been fruitful in probing the principles and processes of SLA and providing valuable pedagogical implications. However, there has been scant attention to the role played by individual differences in learners’ processing of corrective feedback and the learning that results.

Language aptitude is an extensively studied individual difference variable. However, most aptitude research has been predictive in nature, and the primary objective of aptitude testing has been to determine learners’ potential to achieve ultimate L2 success. The results of aptitude tests have been mainly used to select elite learners for language courses or programs (which are often government-funded), diagnose learning disabilities, or place students into classes of appropriate levels (Carroll & Sapon, 2002). Recently, it has been proposed that the boundaries of aptitude research should be expanded to examine how aptitude, or rather different configurations of aptitude components, interacts with different learning conditions (Robinson, 2005) or different stages of L2 development (Skehan, 2012). Research on the interaction between instructional treatment and language aptitude can potentially provide insights into the mechanisms of SLA and suggest pedagogical implications. Unfortunately, to date there has been a dearth of such research. As Spada (2011) observed, “There is a clear need for more research exploring relationships between aptitude (and other individual differences), type of instructional approach and SLA” (p. 232). This study is undertaken to investigate the interactions between the effects of two types of feedback (recasts and metalinguistic correction) and learners’ aptitude differences in language analytic ability and working memory in the learning of Chinese classifiers.

The Modern Language Journal, 97, 3, (2013)
DOI: 10.1111/j.1540-4781.2013.12030.x
0026-7902/13/634–654 $1.50/0
© 2013 The Modern Language Journal

RECASTS AND METALINGUISTIC FEEDBACK

A recast is a reformulation of a non-target-like L2 utterance. Among all the identified corrective strategies, recasts are probably the most studied. Recasts have been found to be the most frequent feedback type in all instructional settings; their popularity is most likely due to their contingency, nonintrusiveness, and affordance of both positive and negative evidence. These characteristics make recasts an ideal form-focusing strategy in meaning-oriented communication. Lyster and Mori (2006) contended that recasts are useful not only as a form-focusing device but also as a scaffolding tool when the content or knowledge required for the maintenance of the ongoing interaction is beyond the learner’s capacity.

Research shows mixed results regarding the effects of recasts. In general, recasts have been shown to be effective in laboratory studies (Egi, 2007; Ishida, 2004; Iwashita, 2003; Leeman, 2003; Lyster & Izquierdo, 2009; Mackey & Philp, 1998; McDonough, 2007; Sagarra, 2007). These studies are typically carried out in dyadic interaction (or via the computer) where learners receive intensive recasts on a single structure. Methodological features such as the lab setting, provision of feedback on a one-on-one basis, and targeting a single structure might make recasts relatively salient and therefore beneficial to L2 development. This speculation has been confirmed by some studies that showed small or no effects for recasts in classroom settings (Ellis, Loewen, & Erlam, 2006; Lyster, 2004; Sheen, 2010; Yang & Lyster, 2010). However, recasts were found to be very effective in classroom settings where their use was salient or intensive. In Doughty and Varela’s (2002) study, a non-target-like utterance was repeated with a rising tone followed by a recast, which made the corrective intention easily recognizable. In Han’s (2002) study, the learners received recasts in 11 sessions on their non-target-like production of English past tenses. The effects of recasts may also be mediated by the target structure. For example, Ammar and Spada (2006) found that recasts were effective for possessive his/her in English, a salient, transparent structure, whereas redundant and/or opaque structures such as French gender (Lyster, 2004), English past tense (Ellis et al., 2006; Yang & Lyster, 2010), and English articles (Sheen, 2007) may not be amenable to such feedback.

One corrective strategy that is often juxtaposed with recasts in feedback research is metalinguistic feedback, which is defined as “comments, information, or questions related to the well-formedness of the student’s utterance” (Lyster & Ranta, 1997, p. 47). Sheen (2011) made a distinction between direct and indirect metalinguistic feedback. The former, also called metalinguistic correction, contains the correct form and a metalinguistic comment; the latter only includes a metalinguistic comment. Sheen (2011) contended that direct metalinguistic feedback contains both positive and negative evidence, facilitates learner awareness at the level of understanding (rather than mere noticing), and is therefore especially useful in learning complex linguistic structures. This study investigates what is called direct metalinguistic feedback in Sheen’s terminology, which is referred to here as metalinguistic correction.

Recasts are often investigated in comparison with metalinguistic feedback partly because the former constitutes an implicit type of feedback and the latter is representative of explicit feedback. However, many would disagree with the implicit–explicit dichotomy on the grounds that recasts can become explicit in situations where learners easily recognize the corrective force even if they are intended to be implicit. To dispel the controversy surrounding the distinction between implicit and explicit feedback, it is important to distinguish instruction from learning. Hulstijn (2005) stated that “instruction is explicit or implicit when learners do or do not receive information concerning rules underlying the input, respectively” (p. 132). Following Hulstijn’s view, then, the implicitness or explicitness of feedback should be determined from the teacher’s perspective. Recasts do not provide rule explanation and are therefore implicit; metalinguistic feedback often subsumes rule explanation, so it is explicit. Whereas the nature of feedback concerns the provider, the nature of learning resulting from feedback relates to the receiver of the feedback. Learning is explicit if it is conscious and involves rule processing or induction; learning is implicit if it happens unconsciously and does not implicate rule processing or induction (Hulstijn, 2005; Robinson, 1997, 2002). While implicit feedback is more likely to lead to implicit learning than explicit feedback, it may also contribute to explicit learning when the learner starts to infer and formulate rules based on available linguistic data (e.g., Long, Inagaki, & Ortega, 1998). By the same token, explicit feedback may invite implicit learning, such as when the learner picks up something unconsciously that is not related to the linguistic structure the feedback targets.

Previous research showed that recasts were less effective than metalinguistic feedback (Ellis, Loewen, & Erlam, 2006; Lyster, 2004; Sheen, 2010), which provides further support for Spada and Tomita’s (2010) and Norris and Ortega’s (2000) meta-analytic finding that explicit instruction was more effective than implicit instruction. However, the meta-analyses by Li (2010) and Mackey and Goo (2007) showed larger long-term effects for implicit feedback. There is also evidence that implicit feedback and explicit feedback were equally effective for more advanced learners but lower-level learners benefited more from explicit feedback, suggesting an impact of proficiency on the effects of feedback (Ammar & Spada, 2006; Li, 2009). It would seem that the unequivocal advantage of explicit feedback/instruction needs to be reconsidered and that the investigation of the mediating variables for corrective feedback should be prioritized in the new agenda for feedback research. One such variable is language aptitude.

LANGUAGE APTITUDE AND CORRECTIVE FEEDBACK

Language aptitude refers to an individual’s capacity to learn a language. The publication of the MLAT (Modern Language Aptitude Test; Carroll & Sapon, 1959) led to an exponential growth in research on language aptitude because it provided a clear definition and valid measure of the construct. The MLAT has been so influential that, to some extent, aptitude is defined in terms of what the MLAT measures. The MLAT consists of five parts that measure three dimensions of aptitude: phonetic coding ability, language analytic ability, and rote learning (memory) ability. Based on the findings of the MLAT studies, Carroll and Sapon (2002) claimed that language aptitude is (a) distinct from IQ, (b) stable and unsusceptible to training, (c) separate from affective variables (motivation, anxiety, etc.), (d) prognostic of learning rate (and diagnostic of learning problems), and (e) not affected by differences in instructional context, target language, or language skill (e.g., written vs. oral).

The MLAT was developed in the heyday of the audiolingual approach (the 1950s and 1960s), which was characterized by drills, rote learning, and the development of explicit grammar knowledge. The audiolingual approach was grounded in Behaviorism, according to which language learning was a matter of habit formation. Subsequently, Behaviorism lost ground to Krashen’s (1981) Universal Grammar-based Monitor Theory, which featured exposure to input and implicit learning. In the meantime, the audiolingual approach gave way to communicative approaches, where aptitude was criticized as being irrelevant. Also, because of the dominance of Krashen’s theory (the Affective Filter Hypothesis in particular), variables other than aptitude, such as motivation and anxiety, became the focus of research on individual differences.

However, several more recent theories (such as the Interaction Hypothesis [Long, 1996] and the Noticing Hypothesis [Schmidt, 1990]) have challenged Krashen’s theories and generated prolific empirical research. The 1990s witnessed a resurrection of aptitude research, which is ascribable to the claims of the newly born theories. For instance, empirical studies guided by some of the above theoretical frameworks demonstrated that there is a need for a certain amount of attention to form in meaning-primary L2 instruction. This made it possible for aptitude to reclaim part of its lost territory in SLA, because of its inherent relevance to form-focused instruction. Also, these theories have spawned a sweeping interest in the cognitive processes of SLA, leading to the study of aptitude components such as analytic ability and working memory as cognitive variables central to L2 acquisition.

In the meantime, aptitude researchers realized the limitation of investigating only the variable of aptitude as a predictor of proficiency. It has become clear that the predictive function of aptitude is of limited theoretical value. While it is true that the results of aptitude tests may serve some pedagogical purposes such as selection of talented language learners, diagnosis of learning disabilities, and prognosis of learning rate, these uses of aptitude are mostly tangential to explaining SLA. Robinson (1997, 2002) pointed out that aptitude should be treated as a dynamic construct that interacts with different learning and instructional conditions that impose different cognitive demands on learners.

The interaction between language aptitude and corrective feedback is a promising research avenue. Both domains are relatively mature: There has been a large body of research on both corrective feedback and language aptitude, such that the constructs are clearly defined and operationalized and reliable research methods, such as measures and data elicitation tasks, have been developed. Integrating the two areas has the potential of obtaining findings that lead to a more precise understanding of SLA processes. There is also a mutual need for investigating the two areas in combination. Feedback research has reached a point where new variables must be introduced in order to obtain more accurate interpretations of the effects of different corrective strategies. Aptitude research has been in hibernation for a long period of time and its revival is in large part dependent on the extent to which its primary objective evolves from a predictive to an explanatory role.

To revitalize aptitude research, it is also critical to update aptitude measures incorporating the latest research findings. For instance, some researchers have argued for the utilization of working memory tests rather than tests of associative memory to capture the learning mechanism of form-focused instruction, that is, brief attention to form during meaningful communication (DeKeyser & Koeth, 2011; Robinson, 2002). Despite a large body of research into the relationship between working memory and SLA processes (Skehan, 2012), there has been a lack of research on working memory as an aptitude component. The following review starts with an overview of the two aptitude components under investigation (language analytic ability and working memory), followed by a summary of the research that has investigated the interface between feedback and the two cognitive variables.

Language Analytic Ability

Carroll (1981) defined language analytic ability (grammatical sensitivity) as “the ability to recognize the grammatical functions of words (or other linguistic entities) in sentence structures” (p. 105). Language analytic ability is often measured with the Words in Sentences subtest of the MLAT. Previous research showed that, among the three components included in the MLAT (phonemic encoding, analytic, and memory), language analytic ability is the most predictive of L2 proficiency (Ehrman & Oxford, 1995; Hummel, 2009; Ranta, 2002). Language analytic ability has been found to be related to general L2 proficiency (Alderson, Clapham, & Steel, 1997; Sparks et al., 2011) and the acquisition of explicit knowledge (Gardner & Lambert, 1965; Horwitz, 1980); whether it is predictive of the acquisition of implicit knowledge is less certain. While it was found to be correlated with oral production in some studies (Ehrman & Oxford, 1995; Horwitz, 1987), there were no such correlations in other studies (Harley & Hart, 1997; Ranta, 2005).

A few studies have examined how language analytic ability interacted with the effectiveness of corrective feedback. DeKeyser’s (1993) longitudinal classroom study included two L2 French classes: One received explicit error correction and the other did not. After a school year, the feedback group did not outperform the comparison group; there was no effect for language analytic ability. Sheen (2007) investigated the extent to which language analytic ability correlated with the effects of recasts and metalinguistic correction in the learning of two uses of English indefinite and definite articles, a as first mention and the as anaphoric reference. Language analytic ability was found to be correlated with the effects of metalinguistic correction but not those of recasts. Trofimovich, Ammar, and Gatbonton (2007) examined the relationship between the effects of computerized recasts and language analytic ability (as well as attention control and working memory). Significant relationships were found between language analytic ability and the effects of the feedback in learning grammatical items (his/her in English) but not lexical items. Taken together, these few studies seemed to indicate that language analytic ability was related to the effects of metalinguistic feedback, and to the effects of recasts in the learning of an easy grammatical structure (his/her) but not a difficult one (the/a); it was not sensitive to the learning of lexicon. However, given the small number of studies, it is premature to draw any conclusions.

Working Memory

The term working memory has been adopted for short-term memory to reflect the fact that, instead of being merely a warehouse to store incoming data, it is also responsible for information processing (Miyake & Friedman, 1998). There are two views on the architecture of the working memory construct (Conway et al., 2007; French, 2006): the unitary approach and the multicomponential or multifaceted approach. Researchers embracing the unitary approach believe that working memory is a single construct that performs both storage and processing functions (Daneman & Carpenter, 1980). Others hold that working memory consists of a central executive and several slave systems (e.g., Baddeley, 2007). The central executive is responsible for the control and regulation of the working memory system. The subcomponents include a phonological loop that stores phonological/auditory information, a visuospatial sketchpad that involves the generation and storage of visual information, and an episodic buffer that integrates information from a variety of systems and from long-term memory.

In L2 research, there has been a call to investigate working memory as an aptitude component (Miyake & Friedman, 1998; Robinson, 2005). Robinson argued that the MLAT and other test batteries of aptitude were developed in audiolingual contexts where rote learning was a defining feature. However, in communicative language teaching, linguistic forms are addressed in meaning-focused instruction, and the processing demands of this type of instruction are different from those of audiolingual classes. Skehan (1982) also argued that the MLAT subtest that is concerned with the memory component of aptitude (asking the learner to memorize some artificial words and then recognize them) measures learners’ associative memory, which may not be the most predictive of language learning. Indeed, working memory has the potential of being the most important aptitude component because it constitutes a converging space for all three aptitude components: phonetic coding, language analytic ability, and memory. However, to date, few studies have focused on the validation of working memory as an aptitude component and its role in SLA.

With respect to the relationship between working memory and the effectiveness of corrective feedback, there have been several published studies, all of which relate to recasts. Mackey et al. (2002) investigated the relationship among working memory, noticing of recasts, and the effects of recasts in the learning of English question formation by Japanese EFL learners. The researchers found a positive correlation between working memory and noticing. They also found that learners with low working memory showed more improvement on the immediate posttest and those with high working memory demonstrated more interlanguage development on the delayed posttest. Mackey and Sachs (2012) investigated the relationship between working memory and interactional feedback (mainly recasts) with 9 older adult ESL learners (ages 65–89). The participants who improved their accuracy in producing the target structure (question formation) were those with the highest working memory scores. Révész’s (2012) study concerns the extent to which the effects of recasts are related to scores of memory in the learning of the English past progressive construction. Significant correlations were found between working memory and written test scores and between phonological short-term memory and oral test scores. Trofimovich, Ammar, and Gatbonton (2007) examined the role of memory (together with attention and analytic ability, as reviewed above) in mediating the effects of computerized recasts. It was found that working memory was not a significant predictor of the learners’ interlanguage development. However, in a similar study, Sagarra (2007) found that the effects of recasts were associated with the learners’ working memory capacities.

Many issues remain unresolved. The studies by Mackey and her colleagues revealed some interesting and thought-provoking findings, but these findings need to be verified with more learners and in different contexts. Trofimovich, Ammar, and Gatbonton (2007) and Sagarra (2007) obtained some conflicting results, and in both studies recasts were provided in computer mode and in discrete item practice. How working memory interacts with feedback in meaningful communication is not clear. All five studies investigated recasts, and how working memory correlates with the effects of other feedback types needs further empirical exploration. Also, in previous research, working memory was either operationalized as phonological short-term memory or, when it was measured using complex, sentence-span tests, it was mainly the recall component (not veracity judgment or reaction time) that was scored. However, research showed that there was a tradeoff between the storage and processing components of working memory (e.g., Waters & Caplan, 1996).

THE CURRENT STUDY

This study examines how learners’ aptitude differences in language analytic ability and working memory interact with the effects of implicit and explicit feedback. The data were collected as part of a larger study investigating factors constraining the effectiveness of corrective feedback. The findings pertaining to the interactions between feedback type and proficiency are reported in a separate article.¹ Therefore, it is not a focus of this study to show the comparative effects of implicit and explicit feedback; rather, the primary concern here is the effect of the interface between feedback type and aptitude components on L2 learning. Consequently, the following research questions are formulated:

1. What is the relationship between the effectiveness of implicit and explicit feedback and learners’ individual differences in language analytic ability?

2. What is the relationship between the two feedback types and learners’ individual differences in working memory?

3. Do the two aptitude components interact differently with the two feedback types?

Participants

Seventy-eight L2 Chinese learners aged 18–38 (M = 20.8) from two large U.S. universities participated in the study. Seventy-five were native speakers of English and 3 reported Korean as their native language; 34 were female and 44 male. At the time of data collection, they were in their 4th, 6th, and 8th semesters of their Chinese study.² The participants were assigned to one of three conditions: implicit (n = 28), explicit (n = 29), and control (n = 21). These groups received recasts, metalinguistic correction, or no feedback, respectively, in response to their non-target-like L2 production. A standardized proficiency test (HSK) was administered to ensure that the three groups were comparable in their proficiency in the L2; a one-way ANOVA showed no significant differences among the three groups in their test scores, F(2, 75) = .15, p = .86. The descriptive statistics are displayed in Table 1.
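The reported F ratio can be approximately recovered from the group sizes, means, and standard deviations in Table 1. The following Python sketch illustrates that arithmetic only; it is not the analysis script used in the study. The function name is hypothetical, and the calculation assumes that the reported SDs are sample standard deviations (computed with n - 1).

```python
# One-way ANOVA recomputed from summary statistics (n, mean, SD).
# The values are the proficiency scores reported in Table 1.
# Assumption: the SDs are sample standard deviations (n - 1 denominator).
groups = {
    "implicit": (28, 29.86, 7.50),
    "explicit": (29, 29.52, 8.34),
    "control": (21, 30.81, 9.83),
}


def anova_from_summary(stats):
    """Return (F, df_between, df_within) from a list of (n, mean, sd)."""
    total_n = sum(n for n, _, _ in stats)
    grand_mean = sum(n * mean for n, mean, _ in stats) / total_n
    # Between-groups sum of squares: weighted deviations of the group means.
    ss_between = sum(n * (mean - grand_mean) ** 2 for n, mean, _ in stats)
    # Within-groups sum of squares: recovered from each group's variance.
    ss_within = sum((n - 1) * sd ** 2 for n, _, sd in stats)
    df_between = len(stats) - 1
    df_within = total_n - len(stats)
    f_ratio = (ss_between / df_between) / (ss_within / df_within)
    return f_ratio, df_between, df_within


f_ratio, df_b, df_w = anova_from_summary(list(groups.values()))
print(f"F({df_b}, {df_w}) = {f_ratio:.2f}")  # prints: F(2, 75) = 0.15
```

The small F value reflects group means that differ by about one point against within-group standard deviations of 7.5 to 9.8, which is consistent with the conclusion that the three groups were comparable in proficiency.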

Target Structure

Chinese classifiers served as the linguistic target of the treatment tasks. A classifier is a word that is used between a determiner (which is typically a number but can also be a demonstrative such as this/that or a quantifier such as several) and a count noun, as in liang ben shu (two CLASSIFIER books) or yi ke shu (one CLASSIFIER tree). The classifier is one of the most striking features of the Chinese language (Li & Thompson, 1981). Semantically, a classifier is used to categorize and quantify a set of objects with the same or similar physical properties or characteristics. Syntactically, “classifiers are units of enumeration employed to mark countability; their occurrence makes the semantic partitioning of nouns visible” (Wu & Bodomo, 2009, p. 490).

The choice of classifier depends on the accompanying noun, not the determiner. The form–meaning mapping of a classifier is transparent: There is usually a one-to-one correspondence between a classifier and the related noun. However, there are situations where more than one classifier is compatible with an object. For instance, there are two possible classifiers for dogs (zhi and tiao), and which is more appropriate is subject to controversy. Also, in Chinese there is a general classifier (ge) that can substitute for a special classifier in many cases, which confuses L2 Chinese learners and partly explains why classifiers constitute a problematic structure for learners. Typical learner errors include failure to use a classifier in an obligatory context, misuse of classifiers, or use of the general, default classifier ge in lieu of a special classifier.
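The three error types just listed can be made concrete with a small Python sketch. It is an illustration only and not part of the study's coding scheme: the lookup table contains just the noun–classifier pairs mentioned in this article (keyed by English gloss), the function name and labels are hypothetical, and ge is treated as non-target-like whenever a special classifier is listed, which simplifies the fact that ge is acceptable in many contexts.

```python
# Special classifiers mentioned in the article, keyed by English gloss.
SPECIAL_CLASSIFIERS = {
    "book": {"ben"},
    "tree": {"ke"},
    "pig": {"tou"},
    "dog": {"zhi", "tiao"},  # more than one classifier is compatible
}
GENERAL_CLASSIFIER = "ge"


def classify_use(noun, classifier):
    """Label a determiner + (classifier) + noun phrase by error type.

    `classifier` is None when the learner produced no classifier at all.
    """
    if classifier is None:
        return "omission"
    if classifier in SPECIAL_CLASSIFIERS[noun]:
        return "target-like"
    if classifier == GENERAL_CLASSIFIER:
        # Simplifying assumption: ge counts as an error for these nouns.
        return "general for special"
    return "misuse"


print(classify_use("pig", "ge"))    # general for special
print(classify_use("dog", "tiao"))  # target-like
print(classify_use("book", None))   # omission
print(classify_use("tree", "ben"))  # misuse
```

Because the classifier is looked up by noun rather than by determiner, the sketch also mirrors the point that classifier choice depends on the accompanying noun.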

The classifier was selected as the target structure because it is problematic for learners at all levels of their interlanguage development; yet it is one of the earliest addressed structures in L2 Chinese instruction, so all learners had some prior knowledge of it. Structures about which learners have partial knowledge but not full mastery are ideal for feedback treatment (Han, 2002). For learners whose first language is a nonclassifier language, such as English (which has measure words, as in “a piece of bread,” but not classifiers), classifier learning involves a two-step procedure: (a) They need to be aware that a classifier must be used between a determiner and a noun, and (b) they need to match specific classifiers in the repertoire with the corresponding nouns. In a sense, classifier learning is both rule- and item-based.

TABLE 1
Descriptive Statistics for Proficiency Scores

Group       n     Mean     SD
Implicit    28    29.86    7.50
Explicit    29    29.52    8.34
Control     21    30.81    9.83

Feedback Operationalization

Implicit feedback was operationalized as recasts; that is, the reformulation of the learner’s non-target-like production of the target structure. The recasts in this study were mostly partial, didactic, ended in a falling tone, and were provided in meaning-focused tasks where the target structure was attended to in information exchange. Aside from the utterances containing the target structure, utterances that subsumed errors related to non-target structures were also responded to with recasts as well as other feedback types to mask the linguistic focus. The following episode, which was extracted from the dataset of this study, exemplifies how a recast was provided.

EXAMPLE 1. Recast: Incorrect Classifier

[Note. CL = classifier]

In this episode, the learner used a wrong classifier (ge) for pigs. The native speaker reformulated the noun phrase by replacing the wrong classifier with the correct one (tou). In the next utterance, the learner repeated the correct classifier and the noun, followed by a descriptive statement about the pigs in the photo.

Following Sheen (2007), explicit feedback was operationalized as metalinguistic correction, that is, the provision of the correct form followed by a metalinguistic clue. This operationalization is motivated by two factors. First, as Sheen pointed out, metalinguistic correction is potentially more effective than a metalinguistic clue alone because it provides positive evidence. Further support for Sheen’s operationalization comes from Ellis (2007), who suggested the principle of “bias for best” (p. 340), that is, operationalizing a feedback type in a way that maximizes its potential effect. Second, a major goal of this study is to explore how the effects of implicit and explicit feedback interact with the two aptitude components. Combining explicit correction and metalinguistic feedback, two explicit feedback types, makes the resulting feedback even more explicit, hence increasing the implicit–explicit contrast.

An additional benefit of providing positive evidence in both feedback groups was to control for the amount of modified output (uptake), which might affect the effectiveness of feedback (e.g., Lyster & Ranta, 1997). Metalinguistic feedback without a model is a type of output-prompting feedback that imposes participatory demands on the learner. Thus, the availability of the correct form in the feedback minimized the likely influence of the confounding variable of uptake. One might argue that due to the explicit nature of metalinguistic feedback, learners who received this type of feedback might still have produced more uptake. However, there has been limited empirical evidence for the benefits of uptake in facilitating SLA, and this was especially true in laboratory settings (Mackey & Philp, 1998).

To be consistent with Chinese pedagogical grammar, the term measure word was used instead of classifier in the metalinguistic clue. Also, to ensure that the learner understood the metalinguistic clue, the information was provided in English, the learner’s L1. The following episode illustrates how explicit feedback was provided.

EXAMPLE 2. Explicit Feedback: Missing Classifier


In the above scenario, the learner failed to use a classifier between the numeral yī (‘a’) and the noun hé (‘river’). The native speaker reformulated the noun phrase by inserting the classifier tiáo and then provided a metalinguistic clue. The learner repaired the non-target-like utterance in the next turn.

Tasks

Two tasks were used to elicit the production of classifiers: picture description and spot the differences. In both tasks, pictures were used, each of which contained a scenario that created a meaningful context for the use of the target structure. In Task 1, the picture description task, the learner was asked to describe seven pictures that contained 15 cases of classifier use. The pictures had different numbers of various objects (such as two trees, a river, three horses) so that the learners had to use classifiers when they described the objects and reported how many of them there were. In Task 2, the spot the differences task, there were three sets of pictures; each set had two pictures that contained more or less the same items but were different in a number of aspects. The native speaker and the learner each held a picture, and the learner asked questions to identify the differences between the pictures. Completion of the task required the use of the same 15 selected classifiers as in Task 1. The native speaker provided explicit or implicit feedback in response to the learner’s wrong classifier use. The learners in the control group were asked to read a story about a Chinese idiom shú néng shēng qiǎo (‘Practice makes perfect’) and retell it by following some clues. Retelling the story did not require the use of the selected classifiers, and no feedback was provided in the control condition.3

The selection of classifiers was based on the responses from 45 native speakers of Chinese to a survey on classifier use. The respondents were Mandarin native speakers studying or working in the local community where this study was conducted. Twenty of these native speakers had a bachelor’s degree, 19 had a master’s degree, and 6 had a doctoral degree. Their specializations were varied, including humanities, science, and engineering. The average age was 32.08. The survey served two purposes. One was to select appropriate classifier + noun combinations for treatment tasks. The other was to ensure that the selected special classifiers could not be replaced by the general classifier gè.

The survey had 40 items, each providing a context for classifier use. For each item, the respondent was asked to fill in the missing classifier and then decide whether the classifier could be replaced by the general classifier. The surveyed classifiers were mostly selected from the textbooks used in the Chinese programs the study participants were enrolled in (e.g., Liu et al., 2009). Additional sources of classifiers were other commercial Chinese textbooks (e.g., Wu et al., 2007) used in North America, Erbaugh’s (1986) list of core classifiers, and the widely used Chinese grammar book by Li and Thompson (1981). The example below shows a sample item in the survey.

EXAMPLE 3. Survey Item

Altogether 15 cases of classifier use were selected out of the 40 surveyed items. To be eligible for inclusion in the study, a classifier had to reach an agreement rate of 80% or higher among the respondents regarding the collocation of the classifier with its accompanying noun and the inability to substitute the general classifier for the special one.
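The eligibility rule can be illustrated with a short Python sketch. It is not the procedure used in the study: the vote counts are invented, the names (`SurveyItem`, `is_eligible`) are hypothetical, and the sketch assumes that the 80% threshold was applied separately to the collocation judgment and to the non-substitution judgment, which the text does not state explicitly.

```python
from dataclasses import dataclass

N_RESPONDENTS = 45      # number of native-speaker respondents reported in the study
THRESHOLD_PERCENT = 80  # minimum agreement rate reported in the study


@dataclass
class SurveyItem:
    noun: str
    collocation_votes: int       # respondents who supplied the same classifier (invented)
    non_substitution_votes: int  # respondents who rejected the general classifier (invented)


def is_eligible(item, n=N_RESPONDENTS, threshold=THRESHOLD_PERCENT):
    """Assumed rule: both agreement rates must reach the threshold."""
    # Integer arithmetic avoids floating-point edge cases at exactly 80%.
    meets_collocation = item.collocation_votes * 100 >= threshold * n
    meets_non_substitution = item.non_substitution_votes * 100 >= threshold * n
    return meets_collocation and meets_non_substitution


# Invented illustration data, not survey results.
items = [
    SurveyItem("river", 43, 40),
    SurveyItem("pig", 36, 36),   # exactly 80% on both criteria
    SurveyItem("book", 44, 30),  # fails the non-substitution criterion
]
eligible = [item.noun for item in items if is_eligible(item)]
print(eligible)
```

Under this reading, an item on which respondents agree about the classifier but often accept the general classifier as a replacement would be excluded.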

Testing and Scoring

The measures used in this study included a proficiency test, tests of treatment effects, a language analytic ability test, and a working memory test. The Appendix presents the details on the different measures of the constructs under investigation, the number of items and possible points for each measure, and the related estimates of internal reliability.

Proficiency. To ensure that the three participant groups were comparable in their L2 proficiency, an adapted version of the HSK (hànyǔ shuǐpíng kǎoshì or ‘Chinese Proficiency Test’) was administered. The HSK is a standardized test of Chinese as a foreign language sponsored by Beijing Language and Culture University and recognized by the People’s Republic of China and numerous countries worldwide. Previous research (e.g., Nie, 2006) has demonstrated that the test has high levels of validity and reliability. The revised HSK used in this study consisted of 60 items: 30, 20, and 10 for listening, grammar, and reading, respectively. Each item was assigned 1 point, with a total score of 60. More weight was given to listening comprehension and grammar than to reading comprehension to align with the format of the interventional treatment, where feedback was provided orally in response to linguistic errors in oral production.

Treatment Effects. A grammaticality judgment test (GJT) and an elicited imitation (EI) test were used to measure the effects of feedback. The GJT and EI tests were used to measure learners’ explicit and implicit knowledge about the target structures, respectively (Ellis et al., 2009). During the GJT, learners were asked to judge whether a sentence was grammatical or ungrammatical or whether they were not sure. If learners judged a sentence to be ungrammatical, they were asked to locate the error and correct it. During the EI test, learners listened to statements relating to their everyday life or personal experience. The stimuli, which were read at normal speed by the researcher and were recorded on an audio disc, were presented manually using a disc player. After each statement the disc was paused to allow learners to decide whether it was true or not true based on their experience or whether they were not sure. (An example stimulus, translated into English, would be “I bought three shirts yesterday.”) The learner was then asked to repeat each statement in correct Chinese.

Both the GJT and the EI test had three versions: a pretest, an immediate posttest, and a delayed posttest. Each version had 15 target items and 8 distracters. Among the 15 target items, 8 were ungrammatical and 7 were grammatical. The three tests had the same target items but different distracting items, and the sequence in which the items were presented was different across the tests. The sentence stimuli in the GJT were different from those in the EI test except for the obligatory contexts for the use of the target structure. In both tests, vocabulary annotation was provided for some key words, including the characters, the alphabetic transcription, and the English translation. During the GJT, the learners were allowed to ask additional vocabulary questions but not grammar-related questions.

The total possible score for each test was 15, with each item receiving 1 point. For GJT items, credit was given when a grammatical sentence was judged to be grammatical, and when an ungrammatical sentence was judged to be ungrammatical and the error was corrected. Credit was also given when a grammatical item was judged to be ungrammatical but the correction was made on a part that was unrelated to the target structure. The scoring criteria for the EI tests were different. Credit was given when the target structure was supplied in obligatory contexts. This meant that no credit was given if the target structure was supplied but the context for the use of the structure was not established (e.g., only repeating a classifier in the original sentence without producing the accompanying noun); it also meant that scoring only focused on the use of the target structure and the rest of a reproduced sentence was ignored. Also, the purpose of an EI test is to measure a learner’s implicit knowledge, which is supposedly unconscious and automatic. Therefore, cases containing self-correction, which showed the learner’s conscious processing of the target structure, did not receive credit.
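The scoring criteria described above can be summarized as a pair of decision rules. The Python sketch below is an illustration only: the function names and the response codes are hypothetical, and it assumes that a “not sure” judgment earned no credit, which follows from the listed criteria but is not stated outright.

```python
def score_gjt_item(is_grammatical, judgment, correction="none"):
    """Return 1 or 0 for one GJT target item.

    judgment:   "grammatical", "ungrammatical", or "not_sure"
    correction: "none", "target_fixed" (the target error was located and
                corrected), or "non_target" (the change was made to a part
                unrelated to the classifier)
    """
    if judgment == "grammatical":
        return 1 if is_grammatical else 0
    if judgment == "ungrammatical":
        if not is_grammatical:
            # Ungrammatical item: credit requires a correct repair of the error.
            return 1 if correction == "target_fixed" else 0
        # Grammatical item: credit only if the change left the target untouched.
        return 1 if correction == "non_target" else 0
    return 0  # "not_sure": assumed to earn no credit


def score_ei_item(classifier_correct, noun_produced, self_corrected):
    """Return 1 only if the classifier appears with its noun and without self-correction."""
    return int(classifier_correct and noun_produced and not self_corrected)


# Example: an ungrammatical item that was rejected and correctly repaired.
print(score_gjt_item(False, "ungrammatical", "target_fixed"))
# Example: a classifier repeated without its accompanying noun.
print(score_ei_item(True, False, False))
```

Summing these item scores over the 15 target items gives the total for each test version.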

Language Analytic Ability. Language analytic ability was measured via the Words in Sentences subtest of the MLAT (Carroll & Sapon, 2002), a widely used aptitude test in SLA research. The subtest was used to measure language learners’ sensitivity to grammatical structures or the “ability to handle the grammatical aspects of a foreign language” (Carroll & Sapon, 2002, p. 3). In each item, a key sentence was provided where a certain part was underlined, and the key sentence was followed by one or more comparison sentences with five underlined parts. Test takers chose the part in the comparison sentence(s) that matched the function of the designated part in the key sentence. The test had 45 items; learners were given 15 minutes to complete the test. One point was assigned for each item, so the total possible score was 45.

Working Memory. A listening span test was developed to measure the learners’ working memory capacities. The rationale behind the decision to use a listening span rather than a reading span test was that the instructional treatment involved oral feedback, which did not draw on learners’ ability to store and process visual stimuli. The test was created using the stimuli developed and validated by Waters and Caplan (1996). It contained 72 sentences divided into 4 sets of sentences at each of the span sizes 3, 4, 5, and 6. Half of the sentences had verbs that required animate subjects and half contained verbs that required inanimate subjects. Half of the sentences were plausible; half were implausible. Implausible sentences were constructed by inverting the animacy of the subject and object noun phrases (e.g., “It was the woman that the fur coat desired”). The sentences were of four types: cleft subject, cleft object, object–subject, and subject–object.4 All four sentence types and two plausibility possibilities (“Good” or “Bad”) were evenly distributed among the test stimuli, and in each set, there was a mixture of sentences with different structures and plausibility possibilities. The sequence in which sentence sets of different span sizes were presented was randomized.

During the test, the learner listened to each sentence in a set and decided whether it was plausible; that is, whether it was about something that could happen in the real world. When the whole set was finished, there was a pause, during which the learner recalled the final word of each sentence in that set and wrote down the words on a blank sheet before starting the next set. The learner was informed that reaction time, plausibility judgment, and word recall were equally important. Unlike previous studies that only included recall scores, this study also recorded reaction time and plausibility scores because WM capacity should involve both the processing and storage functions and because previous studies (Leeser, 2007; Waters & Caplan, 1996) showed that learners sacrificed one component for a better performance in another (such as when learners process more slowly in order to achieve more accuracy in word recall). In data analysis, the working memory score for each participant was the average of the z scores for the three components: reaction time, plausibility judgment, and recall of sentence-final words.
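A composite of this kind can be computed as in the following Python sketch. The data are invented, the function names are hypothetical, and two details are assumptions rather than reported facts: the sketch uses the sample standard deviation, and it offers an optional `reverse_rt` flag because the text does not say whether reaction-time z scores were sign-reversed so that faster responses count as better.

```python
from statistics import mean, stdev


def z_scores(values):
    """Standardize a list of raw scores (sample standard deviation assumed)."""
    m, s = mean(values), stdev(values)
    return [(v - m) / s for v in values]


def wm_composite(rt_ms, plausibility, recall, reverse_rt=False):
    """Average the z scores of the three components for each participant."""
    z_rt = z_scores(rt_ms)
    if reverse_rt:
        # Assumption: flip the sign so that shorter reaction times score higher.
        z_rt = [-z for z in z_rt]
    z_pl = z_scores(plausibility)
    z_rc = z_scores(recall)
    return [mean(triple) for triple in zip(z_rt, z_pl, z_rc)]


# Invented scores for three participants.
rt_ms = [3200.0, 3800.0, 4300.0]
plausibility = [66, 64, 58]
recall = [60, 50, 41]

scores = wm_composite(rt_ms, plausibility, recall, reverse_rt=True)
print([round(s, 2) for s in scores])
```

Because each component is standardized before averaging, the composite scores are centered on zero across participants, which is in line with the near-zero mean reported for the working memory variable.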

Procedure

Each participant attended three one-on-one sessions with a native speaker (the researcher). In session 1, participants took the HSK proficiency test and the GJT pretest. Session 2 started with the EI pretest, after which the learner received feedback (implicit or explicit) from the native speaker interlocutor on his/her non-target-like classifier use in dyadic interaction; the instructional treatment was followed by the immediate GJT and EI posttests. During the third and final session (seven days after session 2), the learner took the delayed GJT and EI posttests, the test of language analytic ability (Part IV of the MLAT), and the working memory test. Table 2 displays the tasks the participants performed in the three sessions and the approximate duration of each of the tasks.

TABLE 2
Procedure of the Study

Session   Task                                                         Duration
1         HSK test                                                     50 min
1         GJT pretest                                                  15 min
2         EI pretest                                                   10 min
2         Treatment tasks (picture description, spot the difference)   40–45 min
2         EI posttest 1                                                10 min
2         GJT posttest 1                                               15 min
3         EI posttest 2                                                10 min
3         GJT posttest 2                                               15 min
3         Test of LAA                                                  15 min
3         WM test                                                      15 min

Note. HSK test: test of proficiency; GJT: grammaticality judgment test; EI: elicited imitation; LAA: language analytic ability; WM: working memory.

Analysis

In order to verify the hypothesis that working memory and language analytic ability tapped into the same construct (i.e., language aptitude), a principal components analysis (using the Direct Oblimin rotation method) was performed, followed by a maximum likelihood structural equation modeling (SEM) analysis.5 Included in both analyses were learners’ scores on the proficiency test, the GJT and EI pretests, the test of language analytic ability, and the working memory test. If the two cognitive variables tapped into the same construct, they would cluster under the same latent variable or factor in the principal components analysis. The SEM analysis was performed to confirm the identified factor solution and to ascertain how well the data fit the model, with an additional benefit of exploring the relationship between the identified latent variables.

Next, two series of multiple regression analyses were conducted. In the first group of analyses, the dependent variables were the immediate and delayed GJT and EI gain scores of all the participants (not of individual groups); the independent variables were the two aptitude components (continuous variables) and the two dummy variables (categorical variables)6 that were related to the effects of the two types of feedback vis-a-vis the control group. A significant coefficient (β) for a dummy variable indicates that the related feedback group significantly outperformed the control group, and the β is equivalent to the mean difference between the two groups involved (or, more precisely, the effect size related to the group difference given that the coefficient was standardized). A multiple regression analysis including both categorical and continuous variables is essentially the same as an analysis of covariance (ANCOVA) where categorical variables serve as independent variables and continuous variables as covariates (Field, 2005). However, an ANCOVA has a slightly different focus: The researcher is often interested in the main effects and the post hoc analyses related to the categorical variable(s) rather than the covariates. A decision was made to conduct multiple regression analyses instead of ANCOVAs because the purpose was to obtain an overall picture about the effects of feedback (compared with no feedback) and the weights of the two aptitude components after the effects of feedback were held constant; of interest were the unique and combined effects of all the included predictors.
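The link between a dummy-variable coefficient and a group mean difference can be demonstrated with a small worked example. The Python sketch below uses invented gain scores and a hand-written least-squares helper (`ols`, a hypothetical name); it is not the analysis reported in this study. It fits a model with an intercept and the two dummy variables only, so the coefficients are unstandardized and unadjusted, whereas the coefficients reported here were standardized and estimated alongside the aptitude variables.

```python
def ols(X, y):
    """Ordinary least squares via the normal equations (X'X) b = X'y."""
    k = len(X[0])
    # Augmented matrix [X'X | X'y].
    A = [
        [sum(row[i] * row[j] for row in X) for j in range(k)]
        + [sum(row[i] * yi for row, yi in zip(X, y))]
        for i in range(k)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        p = A[col][col]
        A[col] = [v / p for v in A[col]]
        for r in range(k):
            if r != col:
                f = A[r][col]
                A[r] = [vr - f * vc for vr, vc in zip(A[r], A[col])]
    return [A[i][k] for i in range(k)]


# Invented gain scores for three small groups.
gains = {"control": [0, 1, 2], "implicit": [3, 4, 5], "explicit": [5, 6, 7]}

X, y = [], []
for group, values in gains.items():
    dum_ex = 1 if group == "explicit" else 0  # explicit vs. control
    dum_im = 1 if group == "implicit" else 0  # implicit vs. control
    for value in values:
        X.append([1, dum_ex, dum_im])  # intercept, DumEx, DumIm
        y.append(value)

intercept, b_ex, b_im = ols(X, y)
print(round(intercept, 2), round(b_ex, 2), round(b_im, 2))
```

With the control group as the reference category, the intercept recovers the control mean and each dummy coefficient recovers the difference between a feedback group's mean and the control mean.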

While the above analyses yielded useful information, the substantive analyses were the second group of regression analyses. In each of the analyses, the predictor variables were the two aptitude components, and the dependent variables were the GJT and EI gain scores (immediate and delayed) of each feedback group (implicit/explicit). The purpose of the analyses was to ascertain the differential contributions of the two variables to the effects of implicit and explicit feedback. The gain scores of the control group were irrelevant in this case.

RESULTS

Principal Components Analysis and SEM Analysis

The principal components analysis and the SEM analysis were conducted to verify the hypothesis that working memory tapped into the same construct as language analytic ability; that is, working memory was an aptitude component. Table 3 reports the descriptive statistics for the observed variables included in the analyses.

TABLE 3
Descriptive Statistics for Variables in the Principal Components Analysis and the Structural Equation Modeling Analysis

Variables                            N    Mean      SD
Proficiency (HSK)                    78   29.99     8.39
GJT pretest                          78   5.78      1.26
EI pretest                           78   3.24      2.22
Language analytic ability            77   24.25     6.37
Working memory (average z scores)    77   .01       .74
  Reaction time (milliseconds)            3769.53   523.63
  Plausibility judgment                   63.64     5.27
  Recall                                  50.79     9.84

Note. GJT = grammaticality judgment test; EI = elicited imitation test.

The principal components analysis showed a clear two-factor solution: Proficiency, GJT, and EI loaded onto the same factor, which was labeled L2 Competence; language analytic ability and working memory loaded onto another factor, which was named Aptitude. The two factors explained 66% of the total variance (see Table 4). The SEM analysis confirmed the two-factor model and at the same time identified a causal path from Aptitude to L2 Competence. The data showed an acceptable fit to the identified model, χ² = 1.25, df = 4, p = .87 (goodness-of-fit indices: RMSEA = .00, NFI = .98, CFI = 1.00, TLI = 1.17). Figure 1 illustrates the results of the SEM analysis; the value on each of the arrows represents the standardized regression coefficient associated with the path. Taken together, the results indicated that working memory and language analytic ability underlay the same latent variable, which was predictive of L2 competence, the other latent variable in the model.
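The reported p value can be recovered from the chi-square statistic and its degrees of freedom. For an even number of degrees of freedom the upper-tail probability has a closed form, so the Python sketch below needs only the standard library. The function name is an assumption introduced here, and the check is an illustration rather than part of the original analysis.

```python
import math


def chi2_upper_tail_even_df(x, df):
    """Upper-tail probability of a chi-square variate; valid for even df only.

    For df = 2k: P(X > x) = exp(-x/2) * sum_{i=0}^{k-1} (x/2)^i / i!
    """
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form requires a positive, even df")
    half = x / 2
    k = df // 2
    return math.exp(-half) * sum(half**i / math.factorial(i) for i in range(k))


# Values reported for the structural model: chi-square = 1.25, df = 4.
p_value = chi2_upper_tail_even_df(1.25, 4)
print(round(p_value, 2))
```

A nonsignificant chi-square of this size means that the model-implied covariances do not differ detectably from the observed ones, which is what an acceptable fit refers to here.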

Regression Analyses

Table 5 reports the descriptive statistics for the separate and combined pretest–posttest gains of the three participant groups. Gain scores were obtained by subtracting pretest scores from posttest scores. It can be observed that all three groups improved from the pretests to the posttests, as shown by the positive gain scores. The explicit group performed better than the implicit group, which in turn improved more than the control group. The gap between the two experimental groups in their delayed gain scores appeared smaller than in the immediate gains.
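The narrowing gap can be checked directly from the group means in Table 5. The following Python sketch is only an arithmetic illustration: the means are copied from Table 5, while the data layout and the names `gain` and `gaps` are assumptions introduced here.

```python
def gain(pretest, posttest):
    """Gain score as defined above: posttest score minus pretest score."""
    return posttest - pretest


# Mean gains copied from Table 5 (posttest 1 = immediate, posttest 2 = delayed).
mean_gains = {
    ("GJT", 1): {"implicit": 3.23, "explicit": 5.31},
    ("GJT", 2): {"implicit": 3.19, "explicit": 4.46},
    ("EI", 1): {"implicit": 4.62, "explicit": 6.43},
    ("EI", 2): {"implicit": 4.08, "explicit": 5.50},
}

# Difference between the explicit and implicit group means for each test and time.
gaps = {
    key: round(groups["explicit"] - groups["implicit"], 2)
    for key, groups in mean_gains.items()
}
for (test, posttest), gap in gaps.items():
    print(f"{test} posttest {posttest}: explicit - implicit = {gap:.2f}")
```

On both measures the explicit advantage shrinks from the immediate to the delayed posttest, which is the descriptive pattern noted above; the sketch says nothing about statistical significance.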

TABLE 4
Results of the Principal Components Analysis

                            Factor Loading
Observed Variables          L2 Competence   Aptitude
Proficiency                 .85
Elicited Imitation          .84
Grammaticality Judgment     .72
Working Memory                              .85
Analytic Ability                            .75

Note. Variance explained: L2 Competence = 43%; Aptitude = 23%; total = 66%.

FIGURE 1
A Structural Model for Language Aptitude and L2 Competence

[Path diagram not reproduced. Latent variables: L2 Aptitude (indicators: WM, LAA) and L2 Competence (indicators: Proficiency, GJT, EI), with a path from L2 Aptitude to L2 Competence. Standardized coefficients shown in the diagram: .38, .39, .78, .81, .48, .83.]

Note. WM = working memory; LAA = language analytic ability; GJT = grammaticality judgment test; EI = elicited imitation; .39 (and other numbers) = standardized regression coefficient.

TABLE 5
Descriptive Statistics for Pretest–Posttest Gain Scores

                          GJT            EI
Posttest   Group   n      M      SD      M      SD
1          A       78     3.28   2.95    4.40   2.95
1          I       28     3.23   2.17    4.62   2.35
1          E       29     5.31   2.82    6.43   2.42
1          C       21     0.54   1.52    1.30   1.34
2          A       78     2.85   2.70    4.00   2.79
2          I       28     3.19   2.25    4.08   2.51
2          E       29     4.46   2.39    5.50   2.81
2          C       21     0.26   1.55    1.90   1.63

Note. GJT = grammaticality judgment test; EI = elicited imitation test; A = all three groups combined; I = implicit; E = explicit; C = control.

To obtain an initial, holistic picture of the impact of the two types of feedback and of the two aptitude components after controlling for the effects of feedback, the gain scores of all participants (all three groups) were subjected to multiple regression analyses using a stepwise variable entry method. Table 6 shows the standardized regression coefficient (β) and significance value for each predictor as well as the R² value for each regression model. A standardized coefficient refers to the change in the outcome in standard deviation units as a result of one standard deviation unit change in the predictor. In the case of a dummy variable, the coefficient indexes the change in outcome as a result of the switch between the two involved groups (i.e., from control to explicit/implicit). R² represents the percentage of variance in the response variable accounted for by the identified regression model (e.g., R² = .23 means 23% of the variance in the response variable was accounted for). The two dummy variables (DumEx for the explicit vs. control comparison; DumIm for the implicit vs. control comparison) were significant predictors for gain scores on all measures (GJT and EI; immediate and delayed). In other words, learners in the two feedback conditions showed significantly more gains than the control condition on all measures. By and large, explicit feedback showed larger coefficients than implicit feedback. After the effects of feedback were controlled, working memory was predictive of the delayed gains, and language analytic ability was a near significant predictor (p = .06) of the delayed GJT gain scores.
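The definitions of the standardized coefficient and of R² can be made concrete with a one-predictor example. The Python sketch below uses invented data and a hypothetical helper, `simple_regression`; it assumes the usual conversions, namely that the standardized coefficient equals the raw slope multiplied by the ratio of the predictor's standard deviation to the outcome's, and that R² equals one minus the ratio of residual to total sum of squares. The models reported in this study had several predictors, so this is a simplified illustration only.

```python
from statistics import mean, stdev


def simple_regression(x, y):
    """Return (raw slope, standardized coefficient, R squared) for one predictor."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    slope = sxy / sxx
    intercept = my - slope * mx
    # Standardized coefficient: change in y (in SD units) per one-SD change in x.
    beta = slope * stdev(x) / stdev(y)
    ss_res = sum((b - (intercept + slope * a)) ** 2 for a, b in zip(x, y))
    ss_tot = sum((b - my) ** 2 for b in y)
    r_squared = 1 - ss_res / ss_tot
    return slope, beta, r_squared


# Invented aptitude scores (x) and gain scores (y) for five learners.
x = [1, 2, 3, 4, 5]
y = [2, 1, 4, 3, 5]

slope, beta, r_squared = simple_regression(x, y)
print(round(slope, 2), round(beta, 2), round(r_squared, 2))
```

With a single predictor the squared standardized coefficient equals R²; with several correlated predictors, as in the analyses reported here, that identity no longer holds.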

While the above results show how different configurations of feedback and aptitude affected the gains of all learners including the control group, the interactions between feedback and aptitude, which were of primary interest in this study, remain unclear. To determine whether such interactions exist, regression analyses were performed using the gain scores for the two experimental groups as response variables and the two aptitude components as predictors. The results, which appear in Table 7, are summarized as follows:

(a) All statistically significant effects involved the delayed gain scores.

(b) Language analytic ability was a significant predictor for the delayed GJT gain scores of the implicit group, β = .44, p < .05. A total of 20% of the variance was explained. No significant relationships were found between language analytic ability and the effects of explicit feedback.

(c) Working memory was predictive of the effects of explicit feedback, and the result was found for both the GJT scores (β = .56, p < .01) and the EI scores (β = .38, p < .05). Altogether 30% of the variance in the GJT scores and 14% of the variance in the EI scores was accounted for. Working memory was not significantly related to the effects of implicit feedback.

TABLE 6
Regression Results for the Effects of Feedback and Contributions of Language Analytic Ability and Working Memory

                      DumEx         DumIm         LAA          WM
Test   Timing         β      p      β      p      β     p      β      p      R²
GJT    Posttest 1     .81*   .00    .41*   .00    .10   .26    .14    .13    .46
GJT    Posttest 2     .77*   .00    .48*   .00    .17   .06    .27*   .00    .51
EI     Posttest 1     .84*   .00    .51*   .00    .05   .63    .11    .26    .47
EI     Posttest 2     .62*   .00    .31*   .02    .09   .41    .25*   .02    .31

Note. *p < .05; DumEx = dummy variable representing the explicit-control comparison; DumIm = dummy variable representing the implicit-control comparison; LAA = language analytic ability; WM = working memory; β = standardized regression coefficient; GJT = grammaticality judgment; EI = elicited imitation; R² = percentage of variance accounted for.

TABLE 7
Regression Results Pertaining to Feedback–Aptitude Interactions

                                 LAA          WM
Feedback   Test   Timing         β      p     β      p      R²
Implicit   GJT    Posttest 1     .25    .27   .11    .61    .10
Implicit   GJT    Posttest 2     .44*   .02   .24    .25    .20
Implicit   EI     Posttest 1     .22    .33   .02    .92    .05
Implicit   EI     Posttest 2     .11    .63   .26    .25    .10
Explicit   GJT    Posttest 1     .04    .83   .18    .39    .04
Explicit   GJT    Posttest 2     .01    .66   .56*   .00    .30
Explicit   EI     Posttest 1     .03    .88   .07    .74    .01
Explicit   EI     Posttest 2     .19    .33   .38*   .04    .14

Note. *p < .05; GJT = grammaticality judgment test; EI = elicited imitation test; LAA = language analytic ability; WM = working memory; β = standardized regression coefficient; R² = amount of variance accounted for.

DISCUSSION

In response to the call for more research into the mediating variables of corrective feedback (Ellis, 2010) and into aptitude–treatment interaction (Robinson, 2005), this study investigated whether two components of language aptitude, language analytic ability and working memory, played different roles in mediating the effects of implicit and explicit feedback. Initial regression analyses performed on the contributions of the two types of feedback and the two aptitude components to the learners’ pretest–posttest gains established that both explicit and implicit feedback were facilitative of the learners’ interlanguage development of Chinese classifiers; after the effects of feedback were controlled, working memory explained a significant portion of the variance of the learners’ gain scores after treatment, but language analytic ability did not. However, subsequent analyses mapping the relationships between the two feedback types and the two aptitude components showed that (a) both aptitude components were significant predictors, and (b) language analytic ability was sensitive to the effects of implicit feedback and working memory to the effects of explicit feedback. Also noteworthy is the impact of testing on the results: All significant results related to the delayed gains and most of them related to GJT scores. In the following, interpretations are sought for the interactions between the aptitude components and the learning conditions and for the influence of testing on the aptitude–treatment interactions.

Language Analytic Ability

Language analytic ability was sensitive to the effects of implicit feedback that contained the correct classifier without metalinguistic information (recasts). It would seem that in the absence of metalinguistic information, learners with higher analytic ability achieved more. These learners were better versed in (a) noticing linguistic problems and (b) extracting and generalizing the syntactic regularities related to classifier use based on the positive and/or negative evidence contained in the provided recasts. However, this interpretation is subject to two concomitant questions:

(a) Did the learners engage in syntactic processing given the implicitness of the feedback?

(b) Was language analytic ability drawn upon given that classifiers constitute a simple, transparent structure?

With regard to the first question, despite the fact that the feedback did not overtly draw the learners’ attention to errors, the saliency and transparency of this linguistic structure, the instructional context (laboratory), and the characteristics of the recasts (partial and didactic) might have made the corrective force of recasts more easily perceived than in other studies of recasts. Robinson (1997) found that in the implicit condition of his study where learners were asked to simply memorize some examples without being provided with any rule explanation, learners with high aptitude claimed to have actively looked for and were able to verbalize rules. In Long, Inagaki, and Ortega (1998), learners were able to explicitly formulate rules about the target structure as a result of receiving recasts. These findings indicate that learners in implicit learning conditions engaged in rule search or induction, which taxed their language analytic ability. With regard to the second question, although the classifier is not a complex (or hard) structure, it does pose problems for native speakers of English, a nonclassifier language. This is confirmed by Polio’s (1994) data showing that native speakers of English committed omission errors in using Chinese classifiers, but that native speakers of Japanese, a classifier language, did not. For speakers of languages without classifiers, the mastery of classifiers necessarily involves the initial recognition of the syntactic permutation (e.g., numeral + classifier + noun) prior to the semantic matching between specific classifiers and the accompanying nouns.

It would seem that whether language analytic ability influences the effects of implicit feedback (recasts) is also constrained by the extent to which the linguistic target is within learners’ processing capacity. This speculation is supported by the conflicting findings obtained in feedback studies. Structures such as classifiers in this study and the English possessive determiners (his/her) in Trofimovich, Ammar, and Gatbonton’s (2007) study did not involve complex form–meaning mapping, which made it possible for the learners to solve problems by utilizing their internal resources. In the case of opaque, hard structures, such as English articles in Sheen’s (2007) study, learners were likely unable to extract rules about the target structure using their own analytic ability (even if there was a high level of noticing). Consequently, the effects of recasts were found to be related to language analytic ability in this study and Trofimovich, Ammar, and Gatbonton’s study, but not in Sheen’s study.

Consideration of the nature of the linguistic target also helps explain why language analytic ability was not related to the gains in the explicit condition. The classifier is a relatively transparent structure; the metalinguistic information contained in the explicit feedback (which stipulated that a classifier was required between a numeral and a noun) was easy to process and internalize. As a result, the learners may have been relieved of the need to apply their language analytic ability. Therefore, while language analytic ability made a difference in the absence of metalinguistic information in the implicit condition, it is the provision of the metalinguistic information that leveled out the role of language analytic ability in the explicit condition. In Sheen’s (2007) study, however, a significant correlation was detected between language analytic ability and the effects of explicit feedback (metalinguistic correction) in the learning of English articles (a/the). This further testifies to the role of the linguistic target: Language analytic ability was drawn upon in processing the metalinguistic information about a hard, opaque structure.

Based upon the available empirical evidence from this study and previous studies, the following hypothesis can be formulated regarding the interaction between language analytic ability and different learning conditions:

Other things being equal, language analytic ability is implicated in implicit conditions in the learning of easy, transparent structures that are within one’s processing capacity, and in explicit conditions in the learning of hard, opaque structures where the internalization of available metalinguistic information sets heavy processing demands on internal cognitive resources.

Clearly the hypothetical claim is debatable because of potential problems such as the field’s inconsistency in operationalizing implicitness/explicitness and controversy over how linguistic difficulty/complexity is determined. Therefore, the falsifiability of the hypothesis is dependent on the extent to which related constructs are clearly and consistently defined and theoretically justified.

Working Memory

Working memory was predictive of the effects of the explicit feedback. The processing demands of classifier learning through external assistance in the form of metalinguistic correction seemed a perfect match to the mechanism of working memory. When the learner’s attention was brought to the target structure through the provided feedback, he/she encoded and registered the auditory stimuli (sound representations about a classifier as well as the metalinguistic information) in the phonological loop, matching the phonological codes with existing codes (e.g., sounds and tones the learner previously learned) archived in long-term memory. This was likely followed by vocal or subvocal rehearsal of the stored information (e.g., repetition of the provided classifier or uptake). The central executive maintained the information in focal attention and processed it for storage in long-term memory through the episodic buffer. The cognitive processing may have taken place by matching a certain classifier with a noun and analyzing the metalinguistic information; it may also have involved the inhibition of other classifiers in the repertoire, which likely competed for the limited capacity of working memory. Evidently, classifier learning in the explicit condition drew heavily on the learner’s ability to store and process the available input, which led to the significant relationship between working memory and the treatment effects.

The finding that working memory was sensitive to the effects of explicit but not implicit feedback may have to do with consciousness. Almost all models of working memory, such as the Multiple Component Model (Baddeley & Logie, 1999), the Executive Attention Model (Engle, 2002), and the Embedded Process Model (Cowan, 1999), acknowledge the role of consciousness and attention control. Baddeley (2007) pointed out that “as has become increasingly obvious over the years, conscious awareness appears to be closely related to the executive control, and hence to the operation of working memory” (p. 302). Engle (2002) even stated that working memory is not about short-term span; rather, it is about the ability to focus attention on relevant information and inhibit irrelevant information. Similarly, Ellis (2009) observed that implicit learning does not implicate central attentional resources; explicit learning, by contrast, relies heavily on working memory because it involves conscious memorization of facts. Indeed, in this study, learners’ ability to focus their attention on the information contained in the explicit feedback and at the same time resist competing information may be critical to the development of their knowledge about classifier use.

The finding that language analytic ability, but not working memory, mediated the effects of implicit feedback and that working memory, but not language analytic ability, was related to the effects of explicit feedback, demonstrated the different processing demands the two learning conditions imposed on the learners’ cognitive resources. As previously stated, classifier learning involves an initial recognition of the syntactic permutation followed by the semantic mapping between individual classifiers and their corresponding nouns. In the implicit condition, where no metalinguistic information was available, learners’ ability to notice, process, and consolidate the syntactic pattern of classifier use seems to have played a greater role than the subsequent processing and storage of individual classifiers. In contrast, in the explicit condition, where information was available about the syntactic component of the target structure, learners’ ability to encode, rehearse, and store individual classifiers and simultaneously to suppress similar classifiers became more important.

Aptitude and Testing

It is interesting that the feedback–aptitude interactions found in this study are subject to the timing of testing: All significant effects were related to the delayed gains. The relation of aptitude measures to the delayed effects of instructional treatment is consistent with the findings of previous research (Erlam, 2005; Mackey et al., 2002; Trofimovich, Ammar, & Gatbonton, 2007). It is not clear why this is so, but researchers have made some reasonable speculations, which boil down to two themes: The immediacy of the first posttests leveled out the role of aptitude, and aptitude “contributed to the capacity to build on initial exposure during training, and continue to learn during the posttests” (Robinson, 2002, p. 204).

Also, the significant results were mainly reflected through measures of explicit knowledge (i.e., the GJT tests). Although working memory was also related to the EI scores in the explicit condition, the gains under this condition may have resulted more from item–learning than system–learning, as evidenced by a lack of relationship between this type of feedback and language analytic ability. Consequently, the tapped knowledge was likely more lexical than syntactic; and lexical knowledge is largely explicit (Dornyei, 2009). The finding is not surprising given that the measures of both cognitive variables involved conscious linguistic processing. According to Ranta’s (2005) review, most significant correlations between aptitude or aptitude components and instructional treatments were found for measures of explicit knowledge, and as was evident from her own study, a measure of language analytic ability was not related to oral fluency, an important dimension of implicit knowledge. Revesz (2012) also found that working memory related to learners’ GJT scores and written production but not oral production. Thus, there seems to be a need to include in aptitude tests a measure of the capacity to acquire implicit knowledge.

CONCLUSION

This study constitutes the first empirical attempt to investigate the relationship between feedback type and aptitude components. It was found that language analytic ability impacted the learning resulting from implicit feedback and working memory influenced the effects of explicit feedback; the significant relationships pertained to delayed posttest scores and explicit knowledge. The findings showed the need for an integrated, situated approach to the role of language aptitude in SLA. They underscore the importance of exploring aptitude–treatment interaction (Snow, 1991) and provide further justification for the necessity of taking a componential rather than a monolithic approach to aptitude research (Dornyei & Skehan, 2003; Robinson, 1997, 2002, 2005). Clearly, the idiosyncratic characteristics of each learning condition (which are molded by feedback type and perhaps also the nature of the linguistic target) set different processing demands on learners’ cognitive abilities, hence the resultant contingent relationships between the two aptitude components and the two feedback types.

The study was conducted in a highly controlled laboratory setting, where the interference of potential distracting variables was minimized. This is critical to aptitude research because an underlying premise for the role of aptitude is that, all other things being equal, learners with higher aptitude learn more and faster. Without controlling the noise from other factors, the effects of aptitude could not have been clearly observed and precisely interpreted. Also, a series of moves were taken to ensure methodological rigor: Reliable measures were used, the treatment tasks were carefully developed, treatment effects were measured by using tests of both explicit and implicit knowledge, and robust statistical procedures were employed. Using pretests and posttests made it possible to examine the impact of aptitude on the gains as a result of treatment, as aptitude concerns the ability to learn rather than the ultimate outcome without controlling learners’ prior knowledge (in the absence of a pretest). Language aptitude relates to a transition theory (development between point A and point B), not the amount of stored knowledge at fixed points. Thus, the appropriateness of investigating the contribution of aptitude to learners’ stored knowledge or ultimate outcome at fixed time points in some previous research (such as correlating aptitude scores with pretest and posttest scores or proficiency scores rather than gain scores) is questionable.
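The distinction between gain scores and scores at fixed time points can be illustrated with a minimal sketch. The function name and the scores below are hypothetical and are not taken from the study's data; the sketch only shows that two learners with identical posttest scores may differ in how much they learned.

```python
def gain_scores(pretest, posttest):
    """Return posttest minus pretest for each learner (hypothetical helper)."""
    if len(pretest) != len(posttest):
        raise ValueError("pretest and posttest must have the same length")
    return [post - pre for pre, post in zip(pretest, posttest)]


# Hypothetical scores for two learners on a 15-point test.
pretest = [5, 10]
posttest = [12, 12]

# Both learners end at 12, but the first gained 7 points and the second 2.
gains = gain_scores(pretest, posttest)
print(gains)
```

Relating aptitude to `gains` rather than to `posttest` alone is one simple way to take prior knowledge into account; it is offered as an illustration, not as the analysis reported in the article.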

Further research including replications is warranted to verify, confirm, or dispute the findings of the current study. Replications are particularly valuable in aptitude research (as well as other lines of SLA research) given the heterogeneity of instructional contexts and inconsistency in construct operationalization. For example, even in the few previous studies examining the mediating role of aptitude in affecting the effectiveness of feedback, working memory and language analytic ability were measured in different ways. Working memory has been measured by means of listening span tests (Mackey et al., 2002; Mackey & Sachs, 2012), reading span tests (Revesz, 2012; Sagarra, 2007), and a number–letter recall test (Trofimovich, Ammar, & Gatbonton, 2007). Measures of analytic ability included a Dutch version of the adapted Words in Sentences subtest of the MLAT (DeKeyser, 1993), a French version of the subtest (Trofimovich, Ammar, & Gatbonton, 2007), and a language analysis test developed in an artificial language (Sheen, 2007). Methodological disparities between studies make their results hardly comparable and make it difficult to reach any conclusions.

It is also necessary to carry out more studies that include other aptitude components, such as phonemic coding ability, to explore the unique and combined effects of multiple factors on L2 achievements under different learning and instructional conditions. Furthermore, it is worthwhile to examine the role of the linguistic target in mediating the relationship between aptitude components and learning conditions. The nature of the target structure was resorted to in accounting for the discrepancies between the findings of this study and those of previous research, but to date there has been no empirical research that included it as an independent variable.

ACKNOWLEDGMENTS

I would like to express my gratitude to the following individuals for the help and support they provided me in various aspects of the project the article is based on: Rod Ellis, Susan Gass, Xiaoshi Li, Shawn Loewen, Roy Lyster, Jenefer Philp, Leila Ranta, Patti Spinner, Hong Wang, and Paula Winke. My thanks are also due to the instructors of Chinese at Michigan State University (Liren Shi, Taiheng Shi, Chunhong Teng, and Qiongyao Wang) and the University of Michigan (Qinghai Chen, Laura Grande, Wei Liu, Le Tang, and Haiqing Yin) for their assistance with data collection. Also, the article has benefited enormously from the insights of the anonymous reviewers and Heidi Byrnes, editor of the Modern Language Journal. I am solely responsible for any limitations and errors.

NOTES

1 The results on the comparative effects of explicit and implicit feedback were reported in another study (Li, 2014), which investigated the interactions between feedback type and proficiency. It was found that explicit feedback was more effective than implicit feedback for low-level learners, but the two types of feedback were equally effective for more advanced learners; explicit feedback showed an initial advantage, but the effects of implicit feedback were better maintained.

2 Although the participants were from different levels of classes, the influence of proficiency was minimized by assigning learners from all levels to each participant group and ensuring that there were no significant differences among the three groups in their test scores on the HSK test. There were also no significant between-group differences in their pretest scores on classifier use. An ideal scenario would have been one in which all the participants were recruited from the same level of classes, but this was not possible due to logistic constraints.

3 The control group performed a different task and thus received a placebo treatment of sorts; essentially, they only took the pretests and posttests, as in many feedback studies (e.g., Ellis, Loewen, & Erlam, 2006). However, it must be admitted that the effects of feedback would have been better disentangled if a comparison group had been included that performed the same task as the experimental groups.

4 The sentence stimuli have the following structures:
• It was the woman that ate the apple. (cleft subject: CS)
• It was the damaged car that the mechanic fixed. (cleft object: CO)
• The police arrested the man that punched his dog. (object–subject: OS)
• The story that the man told amused the audience. (subject–object: SO)
These sentences differ in number of propositions and syntactic complexity. CS and CO sentences have one proposition, but OS and SO sentences have two. CS and OS sentences involve canonical assignment of thematic roles (Agent + Theme) and are therefore easier to process than CO and SO sentences.


5 An anonymous reviewer pointed out that a SEM analysis requires a large sample size. Bentler and Chou (1987) stated that 10 subjects per indicator variable was an acceptable ratio. The SEM analysis in this study included 5 indicator variables and was based on data contributed by 78 subjects. Therefore, the sample size, while not large, was appropriate in this case.
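The arithmetic behind this rule of thumb can be checked with a short sketch. The function name is hypothetical; the figures (78 subjects, 5 indicator variables, a ratio of 10 subjects per indicator) are the ones given in the note.

```python
def subjects_per_indicator(n_subjects, n_indicators):
    """Return the ratio of subjects to indicator variables (hypothetical helper)."""
    if n_indicators <= 0:
        raise ValueError("n_indicators must be positive")
    return n_subjects / n_indicators


# Figures reported in the note: 78 subjects, 5 indicator variables.
ratio = subjects_per_indicator(78, 5)

# The 10-subjects-per-indicator ratio attributed to Bentler and Chou (1987).
meets_rule_of_thumb = ratio >= 10
print(ratio, meets_rule_of_thumb)
```

With 78 subjects and 5 indicators the ratio is about 15.6, which is above the stated threshold of 10.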

6 The two categorical variables were named DumEx and DumIm, representing the explicit–control contrast and the implicit–control contrast respectively. Zeros and ones were used to code the variables. For the DumEx variable, 1 was assigned to the explicit group, and 0 to the other groups (implicit and control); for the DumIm variable, 1 was assigned to the implicit group, and 0 to the other two groups.
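The coding scheme described in this note can be sketched as follows. Only the variable names DumEx and DumIm and the 0/1 assignment come from the note; the function name, the group labels, and the sample list are assumptions made for illustration.

```python
VALID_GROUPS = {"explicit", "implicit", "control"}  # assumed labels


def dummy_code(group):
    """Return (DumEx, DumIm) for one participant's group label."""
    if group not in VALID_GROUPS:
        raise ValueError(f"unknown group: {group!r}")
    dum_ex = 1 if group == "explicit" else 0  # explicit vs. the other two groups
    dum_im = 1 if group == "implicit" else 0  # implicit vs. the other two groups
    return dum_ex, dum_im


# Hypothetical participants, one label each.
groups = ["explicit", "implicit", "control"]
coded = [dummy_code(g) for g in groups]
print(coded)
```

Under this scheme the control group is coded 0 on both variables, so it serves as the reference category against which each feedback group is contrasted.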

REFERENCES

Alderson, J. C., Clapham, C., & Steel, D. (1997). Metalinguistic knowledge, language aptitude and language proficiency. Language Teaching Research, 1, 93–121.

Ammar, A., & Spada, N. (2006). One size fits all? Recasts, prompts, and L2 learning. Studies in Second Language Acquisition, 28, 543–574.

Baddeley, A. (2007). Working memory, thought, and action. Oxford: Oxford University Press.

Baddeley, A., & Logie, R. (1999). Working memory: The multiple–component model. In A. Miyake & P. Shah (Eds.), Models of working memory: Mechanisms of active maintenance and executive control (pp. 28–61). Cambridge: Cambridge University Press.

Bentler, P. M., & Chou, C. P. (1987). Practical issues in structural modeling. Sociological Methods and Research, 16, 78–117.

Carroll, J. B. (1981). Twenty-five years of research on foreign language aptitude. In K. C. Diller (Ed.), Individual differences and universals in language learning aptitude (pp. 83–118). Rowley, MA: Newbury House.

Carroll, J. B., & Sapon, S. (1959). Modern language aptitude test. New York: The Psychological Corporation/Harcourt Brace Jovanovich.

Carroll, J. B., & Sapon, S. (2002). Manual for the MLAT. Bethesda, MD: Second Language Testing.

Conway, A., Jarrold, C., Kane, M., Miyake, A., & Towse, J. (Eds.). (2007). Variation in working memory. Oxford: Oxford University Press.

Cowan, N. (1999). An embedded-process model of working memory. In A. Miyake & P. Shah (Eds.), Models of working memory (pp. 62–101). Cambridge: Cambridge University Press.

Daneman, M., & Carpenter, P. (1980). Individual differences in working memory and reading. Journal of Verbal Learning and Verbal Behavior, 19, 450–466.

DeKeyser, R. (1993). The effect of error correction on L2 grammar knowledge and oral proficiency. Modern Language Journal, 77, 501–514.

DeKeyser, R., & Koeth, J. (2011). Cognitive aptitudes for second language learning. In E. Hinkel (Ed.), Handbook of research in second language teaching and learning (pp. 395–406). New York/London: Routledge.

Dornyei, Z. (2009). The psychology of second language acquisition. Oxford: Oxford University Press.

Dornyei, Z., & Skehan, P. (2003). Individual differences in second language learning. In C. Doughty & M. Long (Eds.), Handbook of second language acquisition (pp. 589–630). Malden, MA: Blackwell.

Doughty, C., & Varela, E. (1998). Communicative focus on form. In C. Doughty & J. Williams (Eds.), Focus on form in classroom second language acquisition (pp. 114–138). Cambridge: Cambridge University Press.

Egi, T. (2007). Recasts, learners’ interpretations, and L2 development. In A. Mackey (Ed.), Conversational interaction in second language acquisition: A collection of empirical studies (pp. 249–267). Oxford: Oxford University Press.

Ehrman, M., & Oxford, R. (1995). Cognition plus: Correlates of language learning success. Modern Language Journal, 79, 67–89.

Ellis, R. (2007). The differential effects of corrective feedback on two grammatical structures. In A. Mackey (Ed.), Conversational interaction in second language acquisition (pp. 339–360). Oxford: Oxford University Press.

Ellis, R. (2009). Implicit and explicit learning, knowledge and instruction. In R. Ellis, S. Loewen, C. Elder, R. Erlam, J. Philp, & H. Reinders (Eds.), Implicit and explicit knowledge in second language learning, testing and teaching (pp. 3–25). Bristol, UK: Multilingual Matters.

Ellis, R. (2010). Cognitive, social, and psychological dimensions of corrective feedback. In R. Batstone (Ed.), Sociocognitive perspectives on language use and language learning (pp. 151–165). Oxford: Oxford University Press.

Ellis, R., Loewen, S., Elder, C., Erlam, R., Philp, J., & Reinders, H. (Eds.). (2009). Implicit and explicit knowledge in second language learning, testing and teaching. Bristol, UK: Multilingual Matters.

Ellis, R., Loewen, S., & Erlam, R. (2006). Implicit and explicit corrective feedback and the acquisition of L2 grammar. Studies in Second Language Acquisition, 28, 339–368.

Engle, R. (2002). Working memory capacity as executive attention. Current Directions in Psychological Science, 11, 19–23.

Erbaugh, M. (1986). Taking stock: The development of Chinese noun classifiers historically and in young children. In C. Craig (Ed.), Noun classes and categorization (pp. 399–436). Philadelphia/Amsterdam: John Benjamins.

Erlam, R. (2005). Language aptitude and its relationship to instructional effectiveness in second language acquisition. Language Teaching Research, 9, 147–171.

French, L. (2006). Phonological working memory and second language acquisition: Developmental study of Francophone children learning English in Quebec. New York: Edwin Mellen Press.

Field, A. (2005). Discovering statistics using SPSS. Thousand Oaks, CA: SAGE.

Gardner, R. C., & Lambert, W. E. (1965). Language aptitude, intelligence, and second-language achievement. Journal of Educational Psychology, 56, 191–199.

Gass, S., & Selinker, L. (2008). Second language acquisition: An introductory course. New York/London: Routledge.

Han, Z. (2002). A study of the impact of recasts on tense consistency in L2 output. TESOL Quarterly, 36, 543–572.

Harley, B., & Hart, D. (1997). Language aptitude and second language proficiency in classroom learners of different starting ages. Studies in Second Language Acquisition, 19, 379–400.

Horwitz, E. (1980). The relationship of conceptual level to the development of communicative competence. (Unpublished doctoral dissertation). The University of Illinois at Urbana–Champaign, Urbana–Champaign, Illinois.

Horwitz, E. (1987). Linguistic and communicative competence: Reassessing foreign language aptitude. In B. VanPatten, T. Dvorak, & J. Lee (Eds.), Foreign language learning (pp. 146–157). Cambridge, MA: Newbury House.

Hulstijn, J. (2005). Theoretical and empirical issues in the study of implicit and explicit second-language learning: Introduction. Studies in Second Language Acquisition, 27, 129–140.

Hummel, K. (2009). Aptitude, phonological memory, and second language proficiency in nonnovice adult learners. Applied Psycholinguistics, 30, 225–249.

Ishida, M. (2004). Effects of recasts on the acquisition of the aspectual form –te i-(ru) by learners of Japanese as a foreign language. Language Learning, 54, 311–394.

Iwashita, N. (2003). Positive and negative input in task-based interaction: Differential effects on L2 development. Studies in Second Language Acquisition, 25, 1–36.

Krashen, S. (1981). Second language acquisition and second language learning. Oxford: Pergamon.

Leeman, J. (2003). Recasts and second language development: Beyond negative evidence. Studies in Second Language Acquisition, 25, 37–63.

Leeser, M. (2007). Learner-based factors in L2 reading comprehension and processing grammatical form: Topic familiarity and working memory. Language Learning, 57, 229–270.

Li, S. (2009). The differential effects of implicit and explicit feedback on L2 learners of different proficiency levels. Applied Language Learning, 19, 53–79.

Li, S. (2010). The effectiveness of corrective feedback in SLA: A meta-analysis. Language Learning, 60, 309–365.

Li, S. (2014). The interface between feedback type, L2 proficiency, and the nature of the linguistic target. Language Teaching Research.

Li, C., & Thompson, S. (1981). Mandarin Chinese: A functional reference grammar. Los Angeles: University of California Press.

Liu, Y., Yao, T., Bi, N., Ge, L., & Shi, Y. (2009). Integrated Chinese (3rd ed.). Boston: Cheng & Tsui Company.

Long, M. H. (1996). The role of the linguistic environment in second language acquisition. In W. C. Ritchie & T. K. Bhatia (Eds.), Handbook of language acquisition. Vol. 2: Second language acquisition (pp. 413–468). New York: Academic Press.

Long, M., Inagaki, S., & Ortega, L. (1998). The role of negative feedback in SLA: Models and recasts in Japanese and Spanish. Modern Language Journal, 82, 357–371.

Lyster, R. (2004). Differential effects of prompts and recasts in form-focused instruction. Studies in Second Language Acquisition, 26, 399–432.

Lyster, R., & Izquierdo, J. (2009). Prompts versus recasts in dyadic interaction. Language Learning, 59, 453–498.

Lyster, R., & Mori, H. (2006). Interactional feedback and instructional counterbalance. Studies in Second Language Acquisition, 28, 269–300.

Lyster, R., & Ranta, L. (1997). Corrective feedback and learner uptake. Studies in Second Language Acquisition, 19, 37–66.

Mackey, A., & Goo, J. (2007). Interaction research in SLA: A meta-analysis and research synthesis. In A. Mackey (Ed.), Conversational interaction in SLA: A collection of empirical studies (pp. 408–452). Oxford: Oxford University Press.

Mackey, A., & Philp, J. (1998). Conversational interaction and second language development: Recasts, responses, and red herrings? Modern Language Journal, 82, 338–356.

Mackey, A., Philp, J., Egi, T., Fujii, A., & Tatsumi, T. (2002). Individual differences in working memory, noticing of interactional feedback, and L2 development. In P. Robinson (Ed.), Individual differences and instructed language learning (pp. 181–209). Philadelphia/Amsterdam: John Benjamins.

Mackey, A., & Sachs, R. (2012). Older learners in SLA research: A first look at working memory, feedback, and L2 development. Language Learning, 62, 704–740.

McDonough, K. (2007). Interactional feedback and the emergence of simple past activity verbs in L2 English. In A. Mackey (Ed.), Conversational interaction in second language acquisition (pp. 323–338). Oxford: Oxford University Press.

Miyake, A., & Friedman, N. (1998). Individual differences in second language proficiency: Working memory as language aptitude. In A. Healy & L. Bourne (Eds.), Foreign language learning: Psycholinguistic studies on training and retention (pp. 339–364). Mahwah, NJ: Lawrence Erlbaum.


Nie, D. (2006). Test–retest reliability of HSK (Elementary–Intermediate Level). China Examinations, 5, 43–47.

Norris, J. M., & Ortega, L. (2000). Effectiveness of L2 instruction: A research synthesis and quantitative meta-analysis. Language Learning, 50, 417–528.

Polio, C. (1994). Non-native speakers’ use of nominal classifiers in Mandarin Chinese. JCLTA, 29, 51–66.

Ranta, L. (2002). The role of language analytic ability in the communicative classroom. In P. Robinson (Ed.), Individual differences and instructed language learning (pp. 159–180). Philadelphia/Amsterdam: John Benjamins.

Ranta, L. (2005). Language analytic ability and oral production in a second language: Is there a connection? In A. Housen & M. Pierrard (Eds.), Investigations in instructed second language acquisition (pp. 99–130). Berlin: Mouton de Gruyter.

Revesz, A. (2012). Working memory and the observed effectiveness of recasts on different L2 outcome measures. Language Learning, 62, 93–132.

Robinson, P. (1997). Individual differences and fundamental similarity of implicit and explicit adult second language learning. Language Learning, 47, 45–99.

Robinson, P. (2002). Effects of individual differences in intelligence, aptitude and working memory on adult incidental SLA: A replication and extension of Reber, Walkenfield and Hernstadt (1991). In P. Robinson (Ed.), Individual differences and instructed language learning (pp. 211–266). Philadelphia/Amsterdam: John Benjamins.

Robinson, P. (2005). Aptitude and second language acquisition. Annual Review of Applied Linguistics, 25, 46–73.

Sagarra, N. (2007). From CALL to face-to-face interaction: The effect of computer-delivered recasts and working memory on L2 development. In A. Mackey (Ed.), Conversational interaction in second language acquisition (pp. 229–248). Oxford: Oxford University Press.

Schmidt, R. (1990). The role of consciousness in second language learning. Applied Linguistics, 11, 129–158.

Sheen, Y. (2007). The effects of corrective feedback, language aptitude, and learner attitudes on the acquisition of English articles. In A. Mackey (Ed.), Conversational interaction in second language acquisition (pp. 301–322). Oxford: Oxford University Press.

Sheen, Y. (2010). Differential effects of oral and written corrective feedback in the ESL classroom. Studies in Second Language Acquisition, 32, 203–234.

Sheen, Y. (2011). Corrective feedback, individual differences, and second language learning. Berlin: Springer.

Skehan, P. (1982). Memory and motivation in language aptitude testing. (Unpublished doctoral dissertation). University of London, London, UK.

Skehan, P. (2012). Language aptitude. In S. Gass & A. Mackey (Eds.), The Routledge handbook of second language acquisition (pp. 381–395). New York/London: Routledge.

Snow, R. (1991). Aptitude–treatment interaction as a framework for research on individual differences in psychotherapy. Journal of Consulting and Clinical Psychology, 59, 205–216.

Spada, N. (2011). Beyond form-focused instruction: Reflections on past, present and future research. Language Teaching, 44, 225–236.

Spada, N., & Tomita, Y. (2010). Interactions between type of instruction and type of language feature: A meta-analysis. Language Learning, 60, 263–308.

Sparks, R. L., Humbach, N., Patton, J. O. N., & Ganschow, L. (2011). Subcomponents of Second-Language Aptitude and Second-Language Proficiency. Modern Language Journal, 95, 253–273.

Trofimovich, P., Ammar, A., & Gatbonton, E. (2007). How effective are recasts? The role of attention, memory, and analytical ability. In A. Mackey (Ed.), Conversational interaction in second language acquisition (pp. 171–195). Oxford: Oxford University Press.

Waters, G., & Caplan, D. (1996). The measurement of verbal working memory capacity and its relation to reading comprehension. Quarterly Journal of Experimental Psychology, 49A, 51–79.

Wu, Y., & Bodomo, A. (2009). Classifiers ≠ determiners. Linguistic Inquiry, 40, 487–503.

Wu, S., Yu, Y., Zhang, Y., & Tian, W. (2007). Chinese link. Upper Saddle River, NJ: Pearson Education.

Yang, Y., & Lyster, R. (2010). Effects of form-focused practice and feedback on Chinese EFL learners’ acquisition of regular and irregular past tense forms. Studies in Second Language Acquisition, 32, 235–263.


APPENDIX

Measures Used in the Study

Measure                      Construct                   Items   Points    Reliability
HSK                          Proficiency                 60      60        .85
Treatment effect
  • Grammaticality judgment  Explicit knowledge          15      15        .74
  • Elicited imitation       Implicit knowledge          15      15        .68
Part IV of MLAT              Language analytic ability   45      45        .81
Listening span test          Working memory              72
  • Reaction time                                        72      Average   .98
  • Plausibility judgment                                72      72        .80
  • Recall                                               72      72        .89

Note. HSK: Chinese proficiency test; listening span test: The working memory score used in the analyses for each participant is the average of the z scores relating to the three components of the test; average: The reaction time for each learner is the average of the reaction times relating to the items for which the plausibility judgments were correct; reliability: Cronbach’s α is used as the reliability coefficient, and reliability estimates relating to the GJT and EI tests are based on the learners’ pretest scores.
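The scoring procedure described in the note can be sketched as follows. This is a minimal illustration under stated assumptions, not the scoring script used in the study: the function names and the data are hypothetical, the sample standard deviation is assumed for standardization, and the note does not say whether reaction-time z scores were reversed before averaging (shorter times usually indicate better performance), so no reversal is applied here.

```python
from statistics import mean, stdev


def mean_rt_correct(reaction_times, judgments_correct):
    """Average reaction time over items whose plausibility judgment was correct."""
    kept = [rt for rt, ok in zip(reaction_times, judgments_correct) if ok]
    return mean(kept)


def z_scores(values):
    """Standardize values across participants (sample SD is an assumption)."""
    m, s = mean(values), stdev(values)
    return [(v - m) / s for v in values]


def wm_composite(rt, plausibility, recall):
    """Average the three component z scores for each participant."""
    columns = zip(z_scores(rt), z_scores(plausibility), z_scores(recall))
    return [mean(triple) for triple in columns]


# Hypothetical data for three participants.
rt = [1.0, 2.0, 3.0]            # mean reaction time on correctly judged items
plausibility = [60, 65, 70]     # plausibility judgment scores (out of 72)
recall = [50, 55, 60]           # recall scores (out of 72)

composite = wm_composite(rt, plausibility, recall)
print(composite)
```

Standardizing each component before averaging puts reaction time, plausibility judgment, and recall on a common scale, so that no single component dominates the composite simply because of its units.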
