Individual differences in the production of referential...

This is a repository copy of Individual differences in the production of referential expressions: The effect of language proficiency, language exposure and executive function in bilingual and monolingual children. White Rose Research Online URL for this paper: http://eprints.whiterose.ac.uk/135553/ Version: Accepted Version Article: Serratrice, L and De Cat, C orcid.org/0000-0003-0044-0527 (2020) Individual differences in the production of referential expressions: The effect of language proficiency, language exposure and executive function in bilingual and monolingual children. Bilingualism: Language and Cognition, 23 (2). pp. 371-386. ISSN 1366-7289 https://doi.org/10.1017/S1366728918000962 © 2019, Cambridge University Press. This article has been published in a revised form in Bilingualism: Language and Cognition https://doi.org/10.1017/S1366728918000962. This version is free to view and download for private research and study only. Not for re-distribution, re-sale or use in derivative works. [email protected] https://eprints.whiterose.ac.uk/ Reuse Items deposited in White Rose Research Online are protected by copyright, with all rights reserved unless indicated otherwise. They may be downloaded and/or printed for private study, or other acts as permitted by national copyright laws. The publisher or other rights holders may allow further reproduction and re-use of the full text version. This is indicated by the licence information on the White Rose Research Online record for the item. Takedown If you consider content in White Rose Research Online to be in breach of UK law, please notify us by emailing [email protected] including the URL of the record and the reason for the withdrawal request.

  • For Peer Review

    Cambridge University Press

    Editorial Office of BLC: 1 (804) 289-8125

    Referential choice in bilingual and monolingual children

    Individual Differences in the Production of Referential Expressions: The Effect of Language Proficiency, Language Exposure and Executive Function in Bilingual and Monolingual Children*

    Ludovica Serratrice, University of Reading

    Cécile De Cat, University of Leeds

    *Acknowledgments

    This research was funded by a grant from the Leverhulme Trust (RPG-2012-633), which is gratefully acknowledged. Special thanks to Sanne Berends for leading on data collection and coding, and to Furzana Shah for assistance with the data collection. Many thanks to Arief Gusnanto for statistical consultancy, and to the many schools who opened their doors to our project, to the participating children for their enthusiasm and their efforts, and to their parents for filling in lengthy questionnaires.

    Address for correspondence:

    Ludovica Serratrice
    University of Reading
    School of Psychology and Clinical Language Sciences
    Harry Pitt Building
    Reading RG6 7BE
    [email protected]

    Abstract

    One hundred and seventy-two English-speaking 5- to 7-year-olds participated in a

    referential communication task where we manipulated the linguistic mention and the

    visual presence of a competitor alongside a target referent. Eighty-seven of the

    children were additionally exposed to a language other than English (bilinguals). We

    measured children’s language proficiency, verbal working memory (WM), cognitive

    control skills, family SES, and relative amount of cumulative exposure to and use of the

    home language for the bilinguals. Children’s use of full Noun Phrases (NPs) to

    identify a target referent was predicted by the visual presence of a competitor more

    than by its linguistic mention. Verbal WM and proficiency predicted NP use, while

    cognitive control skills predicted both the ability to use expressions signalling

    discourse integration and sensitivity to the presence of a discourse competitor, but not

    of a visual competitor. Bilingual children were as informative as monolingual

    children once proficiency was controlled for.

    Keywords: referential choice, anaphora, individual differences, cognitive control,

    gradient bilingualism

    Introduction

    One of the core aspects of human communication revolves around the choice

    of linguistic expressions for referent identification, i.e. the use of proper names (e.g.

    Laura), Noun Phrases (NPs; e.g. the girl, my sister, my sister’s car) and pronouns

    (e.g. she, them, someone) to talk about entities in the world. Adults, and, to some

    extent, preschool and school-age children are sensitive to a number of structural,

    semantic and discourse-pragmatic constraints when it comes to producing referential

    expressions in a communicative context (see Serratrice & Allen, 2015, for an

    overview of the acquisition of reference).

    Despite a general sensitivity to the aforementioned constraints, there are

    individual differences in the extent to which both adults and children rely on

    perspective-taking skills to process and produce referential expressions. Taking the

    perspective of a conversational partner requires the inhibition of one’s own

    perspective and the shifting to that of the addressee. Recent work on adult speakers

    (Ryskin, Benjamin, Tullis & Brown-Schmidt, 2015; Wardlow, 2013), and some

    emerging work in child and adolescent speakers (Nilsen & Graham, 2009; Nilsen,

    Varghese, Xu & Fecica, 2015; Torregrossa, 2017; Wardlow & Heyman, 2016), has

    identified executive function skills, particularly working memory (WM), and

    cognitive control, i.e. the ability to resolve a conflict by inhibiting an irrelevant

    response and promoting relevant information, as significant predictors of individual

    variation in referential communication success. The use of a referential expression

    implies a choice, for example a pronoun vs. an NP. This choice arises from the

    selection between different options and, at least in some cases, it is the outcome of the

    resolution of a conflict between competing alternatives. For example, if the speaker

    and the addressee have different levels of access to a target referent, their mental

    representations will not entirely overlap. The onus is on the speaker to inhibit a

    potentially egocentric perspective and promote an addressee-friendly perspective that

    will maximise the chances of convergence between the mental representations of both

    speaker and addressee. This can translate into choosing a more informative NP (e.g.

    the tall girl), as opposed to a more reduced and less informative expression (e.g. she).

    Because conflict monitoring and resolution depend on the inhibition of irrelevant

    information, the promotion of relevant information, or both, we will adopt the term

    cognitive control to include both the inhibition and the promotion aspects of the

    process (Teubner-Rhodes, Mishler, Corbett, Andreu, Sanz-Torrent, Trueswell &

    Novick, 2016).

    WM refers to the ability to store and manipulate information, and it has been

    connected to perspective-taking and referential choice in at least two ways. Firstly, it

    underpins the storage and updating of the interlocutor’s perspective and the

    comparison of that perspective with one’s own to check for convergence (Nilsen &

    Bacso, 2017; Wardlow, 2013). Secondly, it may be implicated in the use of feedback

    in the case in which one of the interlocutors explicitly signals a mismatch between

    their perspective and that of their conversational partner. Higher verbal WM capacity

    has been shown to correlate positively with 5- and 6-year-olds’ ability to use an adult’s

    non-verbal feedback to produce a discourse-appropriate referential expression

    (Wardlow & Heyman, 2016).

    A parallel line of research has singled out bilingual speakers, both older

    adults and children, as having an advantage in the same executive function skills of

    cognitive control that are associated with referential choice (Bialystok & Martin,

    2004; Morales, Calvo & Bialystok, 2013). Whether bilinguals genuinely have

    superior WM skills compared to monolinguals is, however, not yet clear.

    Some studies report no difference between bilingual and monolingual children

    (Barbosa, Jiang & Nicoladis, 2017; Bialystok, Luk & Kwan, 2005; Engel de Abreu,

    2011), whereas others report an advantage for bilingual children (Morales, Calvo & Bialystok,

    2013).

    In the present study we combine these two independent lines of inquiry to

    investigate how degrees of exposure to and use of English and another home

    language, language proficiency in English, and executive function skills (cognitive

    control and verbal WM), predict the choice of linguistic expressions in a referential

    communication task in monolingual and bilingual children between the ages of 5 and

    7. In the task we manipulated a linguistic factor (the discourse mention of a

    competitor to the target referent) and a non-linguistic factor (the visual presence of a

    competitor to the target referent) to provide new evidence on the sources of contextual

    information used by children in reference production. Previous work has focused on

    children’s use of deictic expressions in referential communication tasks (e.g. Nilsen &

    Graham, 2009), while we were specifically interested in children’s use of anaphoric

    expressions to refer to a previously mentioned antecedent.

    Research including bilingual children has sometimes neglected to take into

    account the SES profile of participants. This is an important limitation as SES is

    known to be predictive of both language and cognitive skills. In the present study

    we therefore included a measure of SES in our analyses.


    Adult speakers are sensitive to a number of structural and discourse-pragmatic

    constraints in their referential choices. They tend to use more pronouns for referents

    that are in subject position (Arnold, 2001) and/or in sentence-initial position

    (Järvikivi, van Gompel, Hyönä & Bertram, 2005), or for referents that are topics

    (Anderson, Garrod & Sanford, 1983). Conversely, competent speakers tend to use

    more informative referential expressions (e.g. proper names and indefinite NPs) when

    the referent is new to the discourse (Gordon, Hendrick, Ledoux & Yang, 1999), or

    when the use of a pronoun might lead to potential ambiguity (Arnold, 2008). Adult

    speakers generally can take the perspective of their listener into account, and they

    choose their referential expressions accordingly. Perspective-taking is predicated

    upon the ability to distinguish between what is in the common ground (Clark, 1992),

    and therefore shared knowledge between speaker and listener, and what is in the

    privileged ground, i.e. knowledge that is only accessible to the speaker. The common

    ground can either be established perceptually, i.e. when it includes referents that are

    visually accessible to both interlocutors, and/or it can be established linguistically via

    the use of discourse-appropriate referential expressions.

    Competent adult speakers typically engage in modelling their addressee’s

    perspective to produce a referential expression that is optimal for their conversational

    partner (Hendriks, Englert, Wubs & Hoeks, 2008). In essence the assumption is that

    competent speakers maintain their own mental representation of their addressee’s

    mental representation. However, the extent to which these meta-representations

    always require an effortful and intentional commitment on the part of the speaker, and

    whether they necessarily rely on explicit Theory of Mind skills, is debated in the

    literature (Horton & Brennan, 2016).

    Even before they have a fully developed Theory of Mind, three-year-olds are

    already at least partly sensitive to the same constraints that regulate referential choice

    in adult speakers (see Allen, Hughes & Skarabela, 2015, for a review). Pre-school

    children are more likely to omit arguments, or use reduced expressions, when they are

    part of the common ground either through joint attention (Skarabela, 2007), previous

    linguistic mention (Allen & Schröder, 2003; Clancy, 2003; Guerriero, Oshima-

    Takane & Kuriyama, 2006; Stephens, 2015), or prior mention and/or perceptual

    availability (Campbell, Brooks & Tomasello, 2000; De Cat, 2011; Matthews, Lieven,

    Theakston & Tomasello, 2006; Rozendaal & Baker, 2010; Salazar Orvig et al., 2010a;

    Salazar Orvig et al., 2010b).

    At the same time, children are notoriously less capable than adults when it

    comes to taking their listener’s perspective into account and to adjusting their

    referential choices accordingly. This has been observed in production studies with

    pre-schoolers (De Cat, 2011, 2015), five-year-olds (Theakston, 2012), and six-year-olds

    (Serratrice, 2008), when children need to provide a referential expression, and up

    to adolescence in comprehension where participants need to make a choice between

    potential referents (Dumontheil, Apperly, & Blakemore, 2010).


    It is becoming increasingly apparent that there are individual differences in the

    degree of perspective-taking abilities, and that this variation may correlate with the

    ability to interpret referential expressions in discourse-pragmatically appropriate ways

    (Brown-Schmidt, 2009; Lin, Keysar & Epley, 2010; Ryskin et al., 2015). Studies on

    adults have focused on the relationship between perspective-taking abilities (indexed

    by referential choice) and cognitive control and WM, two core components of

    executive function. There is some additional evidence that cognitive control also

    plays a role in perspective-taking and referential interpretation in pre-school children.

    In two referential communication studies with three- and five-year-olds, Nilsen and

    Graham (2009) reported that performance on a cognitive control task significantly

    predicted comprehension accuracy for both the younger and the older children.

    However, neither WM nor cognitive control was predictive of accuracy in a

    production task in which the five-year-olds had to provide a disambiguating adjective

    to identify a referent in the privileged ground condition. Nilsen and Graham (2009)

    speculated that this non-significant finding could be due to the fact that their measure

    for assessing children’s perspective taking (i.e. the number of adjectives in the

    common ground condition) was not sufficiently sensitive to reveal the impact of

    cognitive control.

    Some of the adult studies point to a positive correlation between cognitive

    control skills and perspective-taking abilities in the online interpretation (Brown-

    Schmidt, 2009; Lin et al., 2010) and production of referential expressions (Wardlow,

    2013), but others have failed to replicate this finding with monolingual and bilingual

    adults in a spatial perspective-taking task (Ryskin, Brown-Schmidt, Canseco-Gonzalez,

    Yiu, & Nguyen, 2014), and with children with ADHD in a referential

    communication task (Nilsen, Mangal & Macdonald, 2013).

    Verbal WM has also been recently linked to individual differences in

    perspective-taking skills in the production of referential expressions in monolingual

    adults (Wardlow, 2013). Referential choice requires the speaker to focus on those

    conceptual features that make the target different from potential competitors that may

    or may not be accessible to the addressee. This evaluation process relies on the

    storage in memory of the features of the target and it additionally requires a

    comparison with the features of the competitors. This is a complex set of operations

    that involve both the storage and the manipulation of information. In essence these

    demands are comparable to those of a WM task where the information must be

    retained in memory while being subjected to additional operations. Adopting a

    computational modelling approach, Hendriks (2016) has argued for individual

    differences in WM capacity and processing speed as predictors of informativity in

    referential choice. Hendriks (2016) reports on a series of computational simulations

    where the manipulation of WM capacity in the network led to significant differences

    in the use of pronouns vs. NPs to refer back to a potentially ambiguous antecedent

    (van Rij, 2012). In the low WM model there was a significantly higher proportion of

    underspecified and underinformative pronouns than in the high WM model where

    more pragmatically adequate NPs were used.

    The role of verbal WM has not yet been explored in connection with

    referential choice in bilingual children. In monolingual children, Nilsen and Graham

    (2009) did not find WM to be predictive, possibly because of the relatively low task

    demands, but Wardlow and Heyman (2016) found it to be positively correlated with

    5- and 6-year-olds’ ability to benefit from adult non-verbal feedback in a referential

    production task. Children with higher WM improved their use of discourse-

    appropriate referential expressions in the course of the experiment when they received

    feedback that they were being uninformative. In a sample of monolingual German-

    speaking 8- to 10-year-olds Torregrossa (2017) also found a positive correlation

    between WM (indexed by backward-digit-span scores) and the discourse-

    appropriate use of demonstrative pronouns in a story-telling task. In the

    light of Wardlow’s (2013) preliminary findings with adult speakers, Torregrossa’s

    (2017) findings with 8- to 10-year-olds, and the results in the feedback condition for

    the 5- and 6-year-olds in Wardlow and Heyman’s (2016) study, it is theoretically

    interesting to test whether the relationship between choice of referring expressions

    and verbal WM generalizes to bilingual child speakers.


    A parallel but independent line of research has shown, albeit not

    uncontroversially (see Valian, 2015), that cognitive control is one area in which

    bilinguals may have an advantage over monolinguals (Bialystok, 2015). If bilingual

    children do have an advantage when it comes to inhibiting information that is in their

    privileged ground and promoting information in the common ground, and if this kind

    of cognitive control is conducive to referential communication, it follows that

    bilingual children should, in principle, be more successful in choosing discourse-

    appropriate linguistic expressions in a referential communication task that requires

    cognitive control. To date no studies have directly investigated whether individual

    differences in cognitive control and WM confer an advantage to young bilinguals

    when it comes specifically to referential choice. The literature on referential

    expressions in bilingual children and adults has principally focused on the issue of

    cross-linguistic influence, and on whether the interpretation of third person pronouns

    is affected in a null-subject language when the other language has obligatory overt

    subjects (Serratrice & Hervé, 2015). More recently some studies with infants and

    young children have reported a bilingual advantage for sensitivity to referential cues

    (Fan, Liberman, Keysar, & Kinzler, 2015; Liberman, Woodward, Keysar, & Kinzler,

    2017).

    Although superior cognitive control skills may put bilingual children in a

    privileged position in terms of perspective-taking and referential choice, other factors

    must also be considered as predictors of discourse-appropriate linguistic choices. The

    bilingual language experience is, by its very nature, distributed across languages, and,

    at least in relative terms, bilingual children receive proportionally less input in each

    language than monolingual children. Although relative amount of exposure is only an

    indirect and imperfect approximation of input quantity (Carroll, 2017; De Houwer,

    2014; Hurtado, Grüter, Marchman & Fernald, 2014), it has repeatedly been shown to

    correlate robustly with measures of language proficiency (Hoff, Welsh, Place &

    Ribot, 2014; Unsworth, 2013).

    It is plausible to expect a positive correlation between overall language skills

    and the ability to select discourse-appropriate referring expressions. Hence, whatever

    advantage superior cognitive control skills might confer to bilinguals when it comes

    to referential choice, if any, it may be offset by lower language proficiency when

    compared to monolingual children. Ryskin et al. (2014) make a similar claim to

    account for the lack of a bilingual advantage in a spatial perspective-taking task with

    adults. Some evidence that language proficiency may play a role comes from a

    referential communication study (Fan, Liberman, Keysar, & Kinzler, 2015) which

    also included measures of language proficiency (receptive vocabulary), cognitive

    control, and fluid intelligence, in a group of monolingual 5-year-olds and two groups

    of age-matched children who were either bilingual, or exposed to a multilingual

    environment. The only significant effect was that of group, with both the bilingual and

    multilingual-exposure children outperforming the monolinguals. Crucially, the three

    groups did not differ in terms of receptive vocabulary, and therefore it remains to be

    seen whether bilinguals with lower language skills than monolinguals might be

    adversely affected in a linguistic task.

    Another variable that may potentially affect children’s linguistic and cognitive

    performance is SES. SES is a complex construct and it is considered a proxy for

    access to a range of economic, educational and occupational resources (Hauser &

    Warren, 1997; McLoyd, 1998). Although there is a vast and expanding literature on

    the relationship between SES and language and cognitive development, attributing a

    causal role to SES in child development is not straightforward because SES is a

    multifaceted notion and so are language and cognition (Duncan & Magnuson, 2012).

    For example, SES has been shown to affect vocabulary size but not utterance length

    (Hoff-Ginsberg, 1998), grammar but not pragmatic development (Wells, 1986), and

    the effects are greater for expressive than receptive vocabulary (Snow, 1999).

    In monolinguals the complex relationship between linguistic and cognitive

    development and SES is well documented (Hackman & Farah, 2009; Hackman,

    Gallop, Evans & Farah, 2015). When it comes to bilingual children, there is inevitably

    an added layer of complexity. In bilingual populations SES also has a predictive role

    on language and cognitive skills, although it is often not easy to tease apart the

    relative contribution of bilingualism and SES. In many studies there are significant

    cultural differences between the bilingual and the monolingual groups, and the

    immigrant status of the bilinguals may present an additional confound. A number of

    studies have recently tried to disentangle SES from bilingualism (Calvo & Bialystok,

    2014; Carlson & Meltzoff, 2008) and the main finding seems to be that both

    bilingualism and SES independently account for the variance observed in linguistic

    and cognitive tasks. The relationship between SES, bilingualism, and language and

    cognitive performance is, however, complex (Gathercole, Kennedy & Thomas, 2015)

    and is mediated by language exposure, age, and the specific aspect of language (e.g.

    vocabulary vs. grammar) or of non-verbal cognition being tested.

    The present study

    To date, the relationship between perspective-taking skills, cognitive control,

    verbal WM, and referential choice has mostly been studied in the context of online

    comprehension. Studies investigating the predictive role of executive function skills

    in production have reported mixed results (Nilsen & Graham, 2009; Wardlow Lane,

    2013; Ryskin et al., 2015; Torregrossa, 2017; Wardlow & Heyman, 2016).

    The first aim of the present study is to test whether cognitive control, as

    measured by the Simon task, and verbal WM, as measured by backward digit recall,

    are predictive of referential choice in a production task in which child participants

    need to build a complex situation model and identify a target referent in settings in

    which we manipulate the presence of discourse and visual competitors. The prediction

    is that the Simon task score and the backward digit recall score will correlate

    positively with the informativeness of the participants’ referential choices.

    The second aim of the present study is to investigate the contribution of language

    experience to perspective-taking abilities and referential choice. English-speaking

    monolingual children and bilingual children with varying degrees of exposure to a

    language other than English (henceforth the home language) are therefore included in

    the study. Language experience is conceptualized here both in terms of cumulative

    amount of exposure and use of the home language (Bilingual Profile Index, BPI, De

    Cat, Gusnanto & Serratrice, 2017; De Cat & Serratrice, under review), and in terms of

    language proficiency as measured by the Articles sub-test of the Diagnostic

    Evaluation of Language Variation (Seymour, Roeper & de Villiers, 2003), a dialect-

    neutral assessment for 4- to 9-year-olds, that minimizes the effects of language

    exposure differences in bilingual and bicultural children. We expect that children with

    better language proficiency – which is in turn likely to be predicted by the amount of

    exposure and use of English – will be more sensitive to the presence of discourse and

    visual competitors. It is also conceivable that language experience and language

    proficiency would interact, such that bilingual children might display an advantage

    only if their English proficiency falls within the range of their monolingual

    counterparts – as shown by Fan et al. (2015).

    Finally, studies of perspective-taking skills have typically investigated the

    comprehension and use of NPs containing disambiguating size or colour adjectives

    (e.g. the small duck, the red square) that directly pick out an entity in a visual display

    and are therefore not anaphoric (e.g. Nilsen & Graham, 2009; Wardlow & Heyman,

    2016). In contrast, in the present study we are focusing on the use of anaphoric

    expressions, i.e. third person pronouns vs. NPs, and on how the discourse and visual

    contexts determine the choice of a referential expression for a target referent in the

    presence of one or two antecedents that may be either visually present, linguistically

    mentioned, both, or neither.

    The experiment is modelled on the studies by Fukumura, van Gompel and

    Pickering (2010) with monolingual adult participants, which manipulated the

    linguistic mention and the visual presence of a competitor to a target referent.

    Although Fukumura et al. (2010) did not address this issue, the use of an NP in

    conditions in which a pronoun is ambiguous should – at least partly – be predicted by

    cognitive control and verbal WM. Those participants that are more successful at

    inhibiting their egocentric perspective, and have better WM resources to deal with a

    complex scene, should be those that are sensitive to the presence of a discourse and

    visual referent that is in competition with the target.

    Our prediction is that, if – similarly to adults – children are sensitive to both

    the linguistic and the non-linguistic features of the context in creating a discourse

    model, they will produce more informative referential expressions, i.e. full NPs (e.g.

    the princess, the cowboy) when the competitor is previously mentioned and when it is

    visually present.

    SES will be included as a predictor in the analyses alongside measures of

    language proficiency, language exposure and use, cognitive control and verbal WM,

    to assess the contribution that these child-internal factors might make to the use of

    anaphoric expressions in a demanding language production task.

    Method

    Participants

    After receiving ethical approval for the study from the University Research Ethics

    Committee of the second author’s institution, children were recruited in state primary

    schools in the North of England. The final sample included 172 children attending

    year 1 or year 2 of primary school (between the ages of 5 and 7), all of whom were

    schooled exclusively in English. Half of the children (N = 87) were also exposed to a

    language other than English at home; these children will be referred to as bilinguals.

    In this study we adopted a broad definition of bilingualism that reflects the typical

    situation of many classrooms in the UK where children are classified as learners of

    English as an Additional Language (EAL) if ‘a first language, where it is other than

    English, is recorded where a child was exposed to the language during early

    development and continues to be exposed to this language in the home or in the

    community.’ (DfE School Census Guide 2016-2017, p.63). Because of this

    inclusionary criterion, the children in our bilingual group had a wide range of

    exposure (as low as 9%) to 28 different home languages: Punjabi (21% of bilingual

    participants), Urdu (17%), Arabic (9%), French (8%), Spanish (6%), Bengali,

    Cantonese, Catalan, Dutch, Farsi, Greek, Hindi, Italian, Kurdish, Mandarin, Marathi,

    Mirpuri, Nepalese, Pashto, Polish, Portuguese, Shona, Somali, Swedish, Tamil,

    Telugu, Thai, Tigrinya (languages with no percentage indicator accounted for less

    than 5% of the sample). Our bilingual group was therefore deliberately heterogeneous

    to capture the variability of children who are currently considered as bilingual (EAL

    learners) in multilingual classrooms in the UK, and to capitalise on the notion of

    bilingualism as a continuous measure.

    Measures

    In addition to the main referential communication task that is the object of this

    study, we collected information on the children’s SES, on their exposure and use of

    English and the home language, and tested their proficiency in English, their verbal

    WM and their cognitive control skills.

    Socio-economic Status (SES). The children came from schools in a range of

    different catchment areas to ensure variation in SES. We collected information on

    parental education and occupation via questionnaires. Children were allocated an SES

    score on the basis of the highest level of occupation or education in the household

    (either mother or father). Education was coded on a five-point scale (none, primary,

    secondary, further, university), and the occupational data was coded according to the

    reduced method of the UK National Statistics socio-economic classification. We used

    the reversed occupational data scores to make the interpretation of the association

    with the educational level data more transparent, so that a higher value represents an

    advantage. As expected, there was a strong association between the two measures

    (χ²(4, N = 174) = 83.57, p < 0.0001). We also found a weak but significant negative

    correlation between level of bilingualism, as measured by the children’s cumulative

    amount of exposure and use (the Bilingual Profile Index, described below), and SES

    as measured by parental occupation (r = −.25, p = 0.0009).
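    The SES coding described above can be illustrated with a minimal Python sketch. The
    numeric codes for the education scale, the number of occupational classes (n_classes)
    and all function names are illustrative assumptions; the text specifies only a
    five-point education scale, reverse-scored occupational classes, and the use of the
    highest level in the household.

```python
# Illustrative sketch only: numeric codes and names are assumptions, not the authors' script.
EDUCATION_SCALE = {"none": 0, "primary": 1, "secondary": 2, "further": 3, "university": 4}


def reverse_occupation(occupational_class, n_classes):
    """Reverse a class code (1 = most advantaged) so that a higher value is an advantage."""
    if not 1 <= occupational_class <= n_classes:
        raise ValueError("class outside the assumed range")
    return n_classes + 1 - occupational_class


def household_ses(parents, n_classes):
    """Take the highest education and the highest reversed occupation across parents."""
    return {
        "education": max(EDUCATION_SCALE[p["education"]] for p in parents),
        "occupation": max(reverse_occupation(p["occupation"], n_classes) for p in parents),
    }


example = household_ses(
    [
        {"education": "secondary", "occupation": 4},
        {"education": "university", "occupation": 2},
    ],
    n_classes=5,  # assumed; the text does not state the number of classes
)
print(example)
```

    Keeping the two indicators separate, rather than merging them into one number, mirrors
    the fact that the text reports education and occupation as two associated measures.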

    Language exposure and use. We used a parental questionnaire to estimate the

    bilingual children’s relative amount of exposure and use of English and of the home

    language. The questionnaire, which includes both current and cumulative estimates of

    the amount of exposure and use, is modelled on the BiLEC (Unsworth, 2013). The

    parents (usually the mother) completed the questionnaire in English, Bengali, Punjabi

    or Urdu with the help of a bilingual assistant. They were asked to quantify the amount

    of their child’s current exposure and use of the two languages on a typical school day,

    at weekends, and during holiday periods. School days were divided into slots of one

    hour before and after school during which children were exposed predominantly to

    English. It is possible that children may have used the home language with some

    same-language peers at school but because parents – and not teachers – were asked to

    complete the questionnaire, we did not have access to this information and we

    conservatively assumed that during school hours children only heard and used

    English. Parents were asked about all of the child’s interlocutors, and to estimate on a

    five-point scale how often they addressed the child in the home language (never,

    rarely, half of the time, usually, always). We later converted the scores into discrete

    percentage bands ranging from 0 (never) to 100% (always). Parents were also asked

    to recall age of first exposure to English. To calculate the current relative amount of

    exposure to English and the home language for a given child we extrapolated the

    number of hours that the child spends with each interlocutor on a yearly basis, and we

    multiplied this figure by the percentage of time the child used either English or the

    home language with each interlocutor. The resulting values for each of the child’s

    interlocutors were added and then divided by the total number of hours of interaction

    pooled for all interlocutors. If several interlocutors were present at the same time, the

    estimate was divided by the number of interlocutors for the relevant time window.

    The result was a percentage expressing the relative amount of input for English

    and the home language. We applied the same method to the calculation of a relative

    measure of the child’s output, i.e. use of English or the home language. For the

    cumulative amount of input/output in each language we first calculated the number

    of months of home language use only, i.e. before children were exposed to English

    (this was 0 for the simultaneous bilingual children); we then multiplied the number of

    months of bilingual exposure by the proportion of current input/output. The resulting

    figure is the total number of months equivalent to full-time exposure to the home

    language.
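    The exposure calculation described above can be sketched in Python as follows. This is
    a hedged illustration, not the authors' procedure: the intermediate percentage bands
    (only the endpoints 0 and 100% are given in the text), the way co-present interlocutors
    share a time window, the final summation of the two month counts, and all names are
    assumptions.

```python
# Illustrative sketch; band values, names and the final summation are assumptions.
from dataclasses import dataclass

# The text gives only the endpoints (never = 0%, always = 100%); the
# intermediate bands are assumed here to be evenly spaced.
FREQUENCY_BANDS = {
    "never": 0.0,
    "rarely": 0.25,
    "half of the time": 0.5,
    "usually": 0.75,
    "always": 1.0,
}


@dataclass
class Interlocutor:
    yearly_hours: float      # hours per year the child spends with this person
    home_language_use: str   # a key of FREQUENCY_BANDS
    co_present: int = 1      # interlocutors present in the same time window


def relative_home_language_share(interlocutors):
    """Hours-weighted proportion of interaction that takes place in the home language."""
    weighted = 0.0
    total = 0.0
    for person in interlocutors:
        hours = person.yearly_hours / person.co_present  # split shared time windows
        weighted += hours * FREQUENCY_BANDS[person.home_language_use]
        total += hours
    return weighted / total if total else 0.0


def cumulative_home_language_months(months_home_only, months_bilingual, current_share):
    """Months equivalent to full-time home-language exposure (the sum is assumed)."""
    return months_home_only + months_bilingual * current_share


family = [Interlocutor(1000, "always"), Interlocutor(1000, "never")]
share = relative_home_language_share(family)
months = cumulative_home_language_months(12, 48, share)
print(share, months)
```

    The same two functions would apply to output (the child's own use) by entering how
    often the child uses the home language with each interlocutor instead of how often
    they are addressed in it.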

    The use of parental questionnaires to collect information on quantity and

    quality of child-directed input has obvious limitations and has lately come under

    critical scrutiny (Carroll, 2017). Although we acknowledge the constraints of this data

    collection method, we are also confident that it is a pragmatic solution whose validity

    and robustness have been repeatedly confirmed (De Houwer, 2017; Paradis, 2017).

    Current and cumulative measures of input and output in the home language

    were highly correlated in our sample (current input and output: r = .90, p < 0.0001;

    cumulative input and output: r = .95, p < 0.0001). Because we wanted to use both

    dimensions of the language experience as predictors in our analysis but needed to

    avoid collinearity for modelling purposes, we used Principal Component Analysis

    (PCA) to decorrelate the two measures and create a composite score of cumulative

    input and output which we call the Bilingual Profile Index (BPI, De Cat et al., 2017;

    De Cat & Serratrice, under review). The PCA of cumulative input and cumulative

    output yielded two principal components, the first of which captured 98% of the

    variability (given the strength of the correlation between the two cumulative

    measures). The BPI scores correspond to the loadings of that first component,

    reversed (so that a higher score corresponds to more experience in the home

    language) and aligned with a score of 0 for monolinguals. The BPI can be interpreted

    as a cumulative and gradient measure of a bilingual child’s experience of their home

    language, effectively close to the number of full-time months of exposure corrected

    for any imbalance between exposure and use. The range of the BPI in our sample is

    from 0 to 96.
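    To make the construction of such a composite concrete, the following Python sketch
    runs a two-variable PCA by hand and turns the first component into an index that is 0
    for a child with no home-language experience. It is an illustration under assumptions
    rather than the authors' analysis (which was carried out in R): the function name, the
    use of the sample covariance matrix without rescaling, and the way the sign and zero
    point are fixed are all assumed.

```python
# Illustrative sketch of a two-variable PCA; not the authors' R code.
import math


def first_component_index(cum_input, cum_output):
    """Return (index scores, proportion of variance captured by the first component)."""
    n = len(cum_input)
    if n < 2 or n != len(cum_output):
        raise ValueError("need two equally long samples with at least two children")
    mean_x = sum(cum_input) / n
    mean_y = sum(cum_output) / n
    dx = [x - mean_x for x in cum_input]
    dy = [y - mean_y for y in cum_output]
    # Sample covariance matrix [[a, b], [b, c]]
    a = sum(v * v for v in dx) / (n - 1)
    c = sum(v * v for v in dy) / (n - 1)
    b = sum(p * q for p, q in zip(dx, dy)) / (n - 1)
    if a + c == 0:
        raise ValueError("no variance in the data")
    # Largest eigenvalue and its eigenvector for a symmetric 2x2 matrix
    root = math.sqrt(((a - c) / 2) ** 2 + b ** 2)
    largest = (a + c) / 2 + root
    if b != 0:
        vec = (b, largest - a)
    else:
        vec = (1.0, 0.0) if a >= c else (0.0, 1.0)
    norm = math.hypot(*vec)
    w_x, w_y = vec[0] / norm, vec[1] / norm
    if w_x + w_y < 0:  # orient so that more home-language experience scores higher
        w_x, w_y = -w_x, -w_y
    # Component score of a child with no home-language experience (a monolingual)
    zero_point = (0 - mean_x) * w_x + (0 - mean_y) * w_y
    scores = [p * w_x + q * w_y - zero_point for p, q in zip(dx, dy)]
    return scores, largest / (a + c)


scores, explained = first_component_index([0, 10, 20, 30], [0, 10, 20, 30])
print([round(s, 2) for s in scores], round(explained, 2))
```

    With two almost perfectly correlated measures, nearly all of the variance falls on the
    first component, which is why a single index can stand in for both.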

    Language proficiency. We used the Articles sub-test of the Diagnostic

    Evaluation of Language Variation (DELV; Seymour et al., 2003) as a measure of

    language proficiency in English, the language of schooling. The DELV is a language

    assessment of syntax, semantics, pragmatics and phonology for children between the

    ages of 4 and 9. This test was specifically developed to neutralize dialectal differences

    and it focuses on language structures that are common to all children from English-

    speaking backgrounds regardless of the particular variety of English they speak. We

    chose the Articles sub-test as an independent measure of language proficiency as it

    taps into some of the same discourse-pragmatic skills that are required for the

    appropriate use of referential expressions.1

    Verbal working memory (WM). We used the Backward Digit Span task from

    the Wechsler Intelligence Scales for Children (Wechsler, 1991) as a proxy measure

    for children’s verbal WM capacity. The backward digit span was administered

    according to the WISC-III UK instructions: for each digit span the experimenter

    administered two trials, regardless of whether the first trial was passed or failed, and

    discontinued the test after failure on both trials of any item. Backward digit recall is

    one of three complex memory span measures (the other two being listening recall and

    counting recall) that in a confirmatory analysis were shown to load onto one single

    factor by Gathercole, Pickering, Ambridge and Wearing (2004). Unlike forward digit

    recall, which only requires the storage and immediate recall of a sequence of spoken

    items and taps into the phonological loop, backward digit recall involves both the

    phonological loop, for the storage of items, and the central executive, for the

    additional processing in the reversing of the digits.

    1 Performance in this proficiency task is significantly correlated with performance on

    other language proficiency measures collected as part of our larger study including

    the School-Age Sentence Imitation Task (Marinis, Chiat, Armon-Lotem, Gibbons &

    Gipps, 2010). See De Cat & Serratrice (under review, https://osf.io/wkgv7/) for

    details.
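    The administration rule for the backward digit span (two trials per span length,
    discontinuation after failure on both trials of an item) can be expressed as a short
    Python sketch. The data layout, the function names and the choice of the number of
    correct trials as the score are assumptions made for illustration; the text does not
    state which summary score was used.

```python
# Illustrative sketch; the scoring unit (number of correct trials) is an assumption.
def is_correct_reversal(presented, recalled):
    """A trial is passed when the digits are recalled in exactly the reverse order."""
    return list(recalled) == list(reversed(presented))


def score_backward_digit_span(items):
    """items: one (trial_1, trial_2) pair per span length; a trial is (presented, recalled)."""
    correct = 0
    for trial_1, trial_2 in items:
        passed = [is_correct_reversal(*trial_1), is_correct_reversal(*trial_2)]
        correct += sum(passed)  # both trials are always administered
        if not any(passed):     # discontinue after failing both trials of an item
            break
    return correct


session = [
    (([2, 4], [4, 2]), ([5, 7], [7, 5])),                          # both trials passed
    (([6, 2, 9], [9, 2, 6]), ([4, 1, 5], [5, 4, 1])),              # second trial failed
    (([3, 2, 7, 9], [9, 7, 3, 2]), ([4, 9, 6, 8], [8, 6, 4, 9])),  # both failed: stop
    (([1, 5, 2, 8, 6], [6, 8, 2, 5, 1]), ([6, 1, 8, 4, 3], [3, 4, 8, 1, 6])),  # not reached
]
print(score_backward_digit_span(session))
```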

    Cognitive control. Children were administered a computer-based version of

    the Simon task (Simon & Wolf, 1963) programmed and run via E-Prime. The Simon

    task is considered a complex response inhibition task (Garon, Bryson & Smith, 2008)

    because it involves moderate WM demands in addition to the inhibition of a prepotent

    response. Participants need to hold a rule in mind (press the left button when you see

    x, press the right button when you see y), respond according to this rule (physically

    press the key), inhibit a prepotent response when the rule changes, and respond

    accordingly (press the left button when you see y, press the right button when you see x).

    The Simon task is one of many complex inhibition tasks that have been used

    in the developmental literature to measure children’s ability to inhibit a prepotent

    response while responding to a salient conflicting response option (see Garon et al.,

    2008 for a comprehensive review). With specific reference to the bilingual-

    monolingual comparison, previous studies have shown that bilingual children

    outperform monolingual peers only in tasks that assess the interference suppression

    component of cognitive control (Bialystok & Shapero, 2005; Qu, Low, Zhang, Li &

    Zelazo, 2016), but not in tasks that assess response inhibition alone (Martin-Rhee &

    Bialystok, 2008).

    Children sat in front of a 15.6” computer screen and used an E-Prime serial

    response button box with colour-coded buttons (red on the left and green on the right).

    Children started with 8 practice trials followed by 48 test trials; there was no neutral

    condition in which the coloured square would appear in the middle of the screen.

    Accuracy and Reaction Times (RTs) were automatically recorded by E-Prime. The

    index of cognitive control abilities used as a predictor in the present study

    corresponds to the modelled score in the Simon task, i.e. children’s score adjusted for

    age, SES, bilingual experience (indexed by the BPI), and accuracy at the previous

    trial.2 These correspond to the significant predictors of a Cox Proportional Hazard

    regression analysis, as reported in detail in De Cat et al. (2017). The Cox PH model

    captures response accuracy and speed within the same analysis, so the resulting score

    combines both aspects of children’s performance.
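    As a purely descriptive companion to the modelled score, the raw Simon data can be
    summarised by trial type. The Python sketch below is not the Cox Proportional Hazard
    analysis used in the study; it only shows how trials could be classified and
    summarised. The mapping of square colour to the matching button colour, the trial
    fields and the function names are assumptions.

```python
# Illustrative sketch; a descriptive summary, not the Cox PH model used in the study.
from statistics import mean

# Button box as described in the text: red on the left, green on the right.
# That a red square calls for the red button is an assumption.
BUTTON_SIDE = {"red": "left", "green": "right"}


def is_congruent(colour, stimulus_side):
    """A trial is congruent when the square appears on the side of its button."""
    return BUTTON_SIDE[colour] == stimulus_side


def summarise(trials):
    """trials: dicts with 'colour', 'side', 'response' ('left'/'right') and 'rt' in ms."""
    summary = {}
    for label, wanted in (("congruent", True), ("incongruent", False)):
        subset = [t for t in trials if is_congruent(t["colour"], t["side"]) == wanted]
        correct = [t for t in subset if t["response"] == BUTTON_SIDE[t["colour"]]]
        summary[label] = {
            "accuracy": len(correct) / len(subset) if subset else None,
            "mean_rt_correct": mean(t["rt"] for t in correct) if correct else None,
        }
    return summary


trials = [
    {"colour": "red", "side": "left", "response": "left", "rt": 500},
    {"colour": "red", "side": "right", "response": "left", "rt": 700},
    {"colour": "green", "side": "left", "response": "left", "rt": 650},
    {"colour": "green", "side": "right", "response": "right", "rt": 520},
]
result = summarise(trials)
print(result)
```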

    Table 1 provides descriptive statistics for the monolingual and bilingual groups.

    Insert Table 1 here

    ���������������"���������

    Following the design of the studies in Fukumura et al. (2010), the experiment

    manipulated the visual presence and the linguistic mention of a competitor to a target

    referent in a 2x2 design in four conditions: competitor present and mentioned,

    competitor present and not mentioned, competitor absent and mentioned, competitor

    absent and not mentioned. There were five items in each of the four conditions and

    ten filler items. Each experimental item consisted of a set of two coloured

    photographs of iconic Playmobil characters (e.g. fireman, cowboy, ghost, queen),

    while the fillers included coloured geometric shapes and animals. Both the first and

    the second photograph in the experimental set always included the target referent (e.g.

    a fireman). In the competitor present conditions another referent of the same gender

    also appeared in both photographs (e.g. a fireman and a pirate). Half of the experimental

    items contained characters of feminine gender, and the position of the target and the

    competitor was counterbalanced throughout the experiment.

    2 The modelled score was obtained using the predict function of the survival package

    in R (version 2.38.3), which was used for the analysis.
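    The structure of the item set (a 2x2 crossing of visual presence and linguistic
    mention, five items per condition, ten fillers) can be laid out in a few lines of
    Python. This is a sketch for illustration only: the item identifiers, the shuffling
    and the seed are assumptions, and the text does not describe how presentation order
    was determined.

```python
# Illustrative sketch; item identifiers and the shuffling are assumptions.
import random
from itertools import product

# Competitor visually present/absent crossed with competitor mentioned/not mentioned
CONDITIONS = list(product(("present", "absent"), ("mentioned", "not mentioned")))


def build_trial_list(items_per_condition=5, n_fillers=10, seed=0):
    trials = [
        {"type": "experimental", "visual": visual, "linguistic": linguistic, "item": i}
        for visual, linguistic in CONDITIONS
        for i in range(1, items_per_condition + 1)
    ]
    trials += [{"type": "filler", "item": i} for i in range(1, n_fillers + 1)]
    random.Random(seed).shuffle(trials)  # presentation order is not specified in the text
    return trials


trials = build_trial_list()
print(len(trials))
```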

    See Figures 1 and 2 for examples of experimental items in the competitor

    visually present or absent conditions, and the Appendix for a full set of experimental

    and filler items.

    Insert Figure 1 and Figure 2 here

    The first photograph in each set was presented alongside a digitally recorded

    sentence spoken by a female native speaker of Northern British English. The sentence

    was a passive whose subject contained a genitive phrase where the possessor was the

    animate target referent and the possessum was an inanimate entity (e.g. The fireman’s

    bed has been made). In the conditions in which the competitor was mentioned it

    appeared in the passive’s by-phrase (e.g. The fireman’s bed has been made by a

    pirate).

    The rationale for embedding the target referent as the possessor in a genitive

    phrase (e.g. The fireman in The fireman’s bed) was to reduce its accessibility and thus

    generally decrease the likelihood that participants would only ever use pronouns in

    their continuation. It also allowed us to tease apart sentence-initial position from

    topichood. Like Fukumura et al. (2010) we also wanted to ensure that the bias

    towards using a pronoun for a highly salient subject antecedent would not completely

    obliterate the role of the visual context. The photographs were embedded in a

    PowerPoint presentation. The second picture appeared after the first had disappeared

    off the screen and was accompanied by the pre-recorded prompt “And now…”.

    Procedure

    The children were tested on school premises. Two female experimenters took

    part in the task; experimenter A sat next to the participant; the participant sat in front

    of a laptop computer and the two were separated by a divider so they could not see

    what the other was looking at but they could see each other. Experimenter B

    introduced the task to the participant as a communication game and explained that the

    aim was to give instructions to experimenter A so that she could re-create the scenes

    in the child’s pictures with the toys that she was given by experimenter B.

    Experimenter B pressed the space bar on the child’s laptop on each trial to start the

    experiment and to move on to the next item. Before the experiment started there were

    two practice trials with feedback. No children had to be discarded for not

    understanding the task. At the start of each trial experimenter B pressed the space bar

    and the first picture appeared on the computer screen accompanied by the pre-

    recorded linguistic description (e.g. “The fireman’s bucket has been filled (by a

    musician)”) lasting an average of 4000 ms. The space bar was pressed again at the

    end of the sentence and the target picture would appear accompanied by the prompt

    “And now…”. This was the participant’s cue to start giving directions to experimenter

    A to arrange the toys to recreate the scene that the child would describe (e.g. And now

    the fireman/he/the man is carrying the bucket). Experimenter A had the same toys

    that were present in the child’s picture. When the participant had completed their

    instruction they looked round the divider to see whether the experimenter’s toy

    arrangement matched the photograph on their computer screen. The experimenter

    remained in their seat, showed the participant their toys and asked “Like that?”.

    Whenever the participant used an under-informative pronoun, experimenter A always

    chose the competitor to give the participant indirect feedback about their level of

    underinformativity.

    Transcription and coding

    Participants’ instructions to experimenter A were digitally recorded and

    transcribed using CHAT for CHILDES (MacWhinney, 2000); utterances were later

    imported into Excel and coded for the following features: mention of target referent

    (1 = target referent; 0 = competitor); label used (repeated name from the preamble

    sentence, e.g. the king; an alternative label in the same semantic field, e.g. the prince

    instead of the king; an alternative label that only matched the referent in gender, e.g.

    the man instead of the king, the lady instead of the dentist); discourse integration (1 =

    pronouns and definite NPs anaphorically referring to the target referent, e.g.

    he/she/the queen; 0 = indefinite pronouns, e.g. somebody, and indefinite NPs, e.g.

    a man, that do not make clear anaphoric reference to the target).
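
The coding scheme just described can be pictured as one record per utterance. The Python sketch below is illustrative only: the class and field names (`Label`, `CodedUtterance`, `mentions_target`, `discourse_integrated`) are assumptions introduced here and are not the column names of the authors' Excel coding sheet. Leaving `label` empty for pronouns and indefinites is likewise an assumption.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Label(Enum):
    """Type of definite label used for the referent (hypothetical names)."""
    REPEATED_NAME = "repeated name"            # e.g. "the king"
    SAME_SEMANTIC_FIELD = "semantic field"     # e.g. "the prince" for "the king"
    GENDER_ONLY = "gender only"                # e.g. "the man" for "the king"


@dataclass(frozen=True)
class CodedUtterance:
    expression: str              # referential expression as transcribed
    mentions_target: int         # 1 = target referent, 0 = competitor
    label: Optional[Label]       # None when no definite label applies (assumption)
    discourse_integrated: int    # 1 = pronoun or definite NP, 0 = indefinite

    def __post_init__(self) -> None:
        # Both binary codes must be 0 or 1.
        for name in ("mentions_target", "discourse_integrated"):
            if getattr(self, name) not in (0, 1):
                raise ValueError(f"{name} must be 0 or 1")


# Made-up examples built from the expressions quoted in the text.
examples = [
    CodedUtterance("the king", 1, Label.REPEATED_NAME, 1),
    CodedUtterance("the man", 1, Label.GENDER_ONLY, 1),
    CodedUtterance("somebody", 1, None, 0),
]
```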

    The “discourse integration” coding draws a binary distinction between

    anaphoric and non-anaphoric expressions; the “label used” coding provides a more

    fine-grained distinction within different types of anaphoric referential expressions.

    While the king, the prince, the man are all definite NPs, they vary along a continuum

    of disambiguating information. We deliberately chose stereotypical and easily

    identifiable referents for the experimental items (i.e. king, fireman, astronaut, queen,

    nurse, etc.). To be maximally informative in the task, participants should ideally have

    used the label that was provided in the preamble description associated with the first

    photograph in the experimental pair. Using a different and less informative label

    might lead to potential ambiguity that would, in turn, increase as a function of the

    label’s lack of informativeness. So, in the case of a label in the same semantic field

    (e.g. prince instead of king) the likelihood of ambiguity would not be as high as in the

    case of a highly underspecified definite NP like the man that would give experimenter

    A only a vague cue to select the appropriate target toy to reconstruct the scene, and

    would be just as underinformative as a third person or an indefinite pronoun.


    Results

    Table 2 provides descriptive statistics for the results of the DELV Articles

    sub-test (language proficiency), the backward digit recall task (verbal WM) and the

    Simon task (cognitive control) for the monolingual and the bilingual groups. Note that

    the scales are different for the three measures. For the DELV, the score is the proportion of

    accurate responses, from 0 to 1; for the backward digit recall it is the number of accurately

    recalled digits, scored from 0 to 4; and for the Simon task it is an index of

    cognitive control adjusted for age, SES, bilingual experience and accuracy on the

    previous trial; negative scores indicate better cognitive control skills.

    Insert Table 2 here

    A linear regression model fitted using the lme4 package (version 1.1.11) in R

    (version 3.2.4) to the overall score in the DELV Articles sub-test showed that

    performance was negatively correlated with the BPI (t(168) = -2.90; p = 0.004). As

    expected, bilingual children performed more poorly than monolinguals overall, and

    greater exposure to and use of the home language was correlated with lower proficiency

    scores. There was no significant effect of the BPI in the verbal WM task (t(181) =

    -0.29; p = 0.77). For the Simon task the results of a Cox-P Regression model showed

    a near-significant effect of group (χ²(1) = 3.8, p = 0.05) and a significant effect of

    home language experience over and above the effect of group, as the BPI was a

    positive predictor (χ²(1) = 12.13, p = 0.0005). There was, however, no significant

    interaction between bilingualism and cue congruency, and hence no Simon effect in

    the strict sense (in line with previous studies).
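
As a worked check on the two one-degree-of-freedom chi-square statistics reported for the Simon task, the p-values can be recovered with the Python standard library. This is a minimal sketch, not the authors' R code: it relies only on the identity that the upper tail of a chi-square distribution with 1 degree of freedom equals erfc(sqrt(x / 2)). The function name `chi2_sf_1df` is introduced here for illustration.

```python
import math


def chi2_sf_1df(statistic: float) -> float:
    """Upper-tail probability of a chi-square variable with 1 degree of freedom."""
    if statistic < 0:
        raise ValueError("a chi-square statistic cannot be negative")
    return math.erfc(math.sqrt(statistic / 2.0))


p_group = chi2_sf_1df(3.8)    # effect of group
p_bpi = chi2_sf_1df(12.13)    # effect of the BPI

print(f"group: p = {p_group:.3f}")
print(f"BPI:   p = {p_bpi:.4f}")
```

The first value falls just above the conventional .05 threshold, which is consistent with the text describing the group effect as near-significant.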

    We conducted three analyses to address the role of cognitive control, verbal

    WM, cumulative home language exposure and use, SES, and language proficiency on


    the children’s use of referential expressions. In the first analysis, following Fukumura

    et al. (2010), our DV only included exact repetitions of the target referent named in

    the context sentence vs. the use of third person pronouns. Two further analyses were

    necessary to capture the broader picture. In the second analysis, we included all

    referential expressions that made anaphoric reference to the target and investigated

    their informativeness by creating a binary DV: (1) underinformative expressions:

    third person singular pronouns (e.g. he/she) and underinformative definite NPs – e.g.

    the man instead of the king, the lady instead of the queen; and (2) definite NPs that

    were either exact repetitions of the definite NP in the preamble sentence, or

    semantically related labels (e.g. the prince instead of the king, the singer instead of

    the musician).

    The third analysis identifies the factors that predict lack of discourse

    integration. We used a two-way distinction between indefinites signalling a lack of

    anaphoric discourse integration (i.e. indefinite NPs and indefinite pronouns), and

    pronouns and definite NPs that made anaphoric reference to the target.
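
The three dependent variables described above can be summarised as three mappings from the type of referential expression to a binary code. The sketch below is an illustration under stated assumptions: the category names are hypothetical, the 0/1 direction of each code is chosen here so that 1 is the more informative or more integrated option, and `None` stands for a response excluded from that analysis. Bare nouns and references to the competitor are not modelled.

```python
from typing import Optional

# Hypothetical category names for the expression types discussed in the text.
PRONOUN = "pronoun"          # third person pronoun, e.g. he/she
REPEATED = "repeated_np"     # exact repetition of the preamble NP, e.g. the king
RELATED = "related_np"       # semantically related definite NP, e.g. the prince
GENDER_ONLY = "gender_np"    # underinformative definite NP, e.g. the man
INDEFINITE = "indefinite"    # indefinite NP or pronoun, e.g. a man / somebody

KINDS = {PRONOUN, REPEATED, RELATED, GENDER_ONLY, INDEFINITE}


def _check(kind: str) -> None:
    if kind not in KINDS:
        raise ValueError(f"unknown expression type: {kind}")


def dv_analysis1(kind: str) -> Optional[int]:
    """1 = repeated definite NP, 0 = pronoun; everything else is excluded."""
    _check(kind)
    return {REPEATED: 1, PRONOUN: 0}.get(kind)


def dv_analysis2(kind: str) -> Optional[int]:
    """1 = informative definite NP, 0 = underinformative; indefinites excluded."""
    _check(kind)
    return {REPEATED: 1, RELATED: 1, PRONOUN: 0, GENDER_ONLY: 0}.get(kind)


def dv_analysis3(kind: str) -> int:
    """1 = anaphoric expression (pronoun or definite NP), 0 = indefinite."""
    _check(kind)
    return 0 if kind == INDEFINITE else 1
```

For example, `the man` used for `the king` would be excluded from the first analysis, coded 0 in the second and coded 1 in the third.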

    We fitted generalized linear mixed models using the lme4 package (version

    1.1.15) in R (version 3.4.4). The models were fitted incrementally by adding

    predictors one by one and retaining them only if they improved the model fit, yielding

    a significant reduction in AIC and a significant R-squared value, with model

    comparison estimated by likelihood ratio tests (Baayen, 2008). In each of the three

    analyses we treated item as a random factor; participant was not included as a random

    factor because it would compete with the fixed factors capturing participant-related

    variables such as the BPI, SES or proficiency. We tested for the significance of the

    following fixed factors: the presence/absence of a discourse or a visual competitor,

    the Simon task score (cognitive control), the backward digit recall score adjusted for


    age and proficiency (verbal WM), the DELV Articles sub-test score (language

    proficiency), the BPI score (cumulative home language use and exposure), the SES

    score, and age (in months). Age and Simon task scores were centered to facilitate the

    interpretation of the models. The following interactions were also tested in all

    analyses: visual competitor × discourse competitor (yielding the 4 experimental

    conditions), discourse competitor × each participant-related predictor (BPI, SES, WM,

    cognitive control), visual competitor × each participant-related predictor (BPI, SES,

    WM, cognitive control), BPI × SES, BPI × proficiency, WM × proficiency. Gender

    was added as a covariate. Age correlated strongly with other participant-related

    predictors and could therefore not be included in the models without resulting in lack

    of convergence. In the following we report the optimal models.
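
The incremental model-building strategy described in this paragraph can be sketched generically. The code below is not the authors' lme4 script. It is a simplified Python illustration that keeps a candidate term only if it lowers the AIC and passes a likelihood ratio test, assuming each term adds exactly one parameter. The R-squared criterion mentioned in the text is not modelled. The function names, the `toy_fit` stand-in and its log-likelihood gains are all made up for the example.

```python
import math
from typing import Callable, List, Sequence, Tuple

# A fitting routine maps a list of terms to (log-likelihood, number of parameters).
Fit = Callable[[Sequence[str]], Tuple[float, int]]


def aic(loglik: float, n_params: int) -> float:
    """Akaike information criterion: 2k - 2 log L."""
    return 2 * n_params - 2 * loglik


def lrt_p_1df(loglik_null: float, loglik_alt: float) -> float:
    """Likelihood ratio test p-value for nested models differing by one parameter."""
    stat = max(0.0, 2.0 * (loglik_alt - loglik_null))
    return math.erfc(math.sqrt(stat / 2.0))  # chi-square(1) upper tail


def forward_select(candidates: Sequence[str], fit: Fit, alpha: float = 0.05) -> List[str]:
    """Add candidate terms one by one, keeping those that improve the fit."""
    kept: List[str] = []
    ll_kept, k_kept = fit(kept)
    for term in candidates:
        ll_new, k_new = fit(kept + [term])
        lower_aic = aic(ll_new, k_new) < aic(ll_kept, k_kept)
        significant = lrt_p_1df(ll_kept, ll_new) < alpha
        if lower_aic and significant:
            kept.append(term)
            ll_kept, k_kept = ll_new, k_new
    return kept


# Toy stand-in for a real model fit: invented log-likelihood gains per term.
TOY_GAINS = {"visual_competitor": 6.0, "gender": 0.3, "wm": 4.0}


def toy_fit(terms: Sequence[str]) -> Tuple[float, int]:
    loglik = -1500.0 + sum(TOY_GAINS[t] for t in terms)
    return loglik, 2 + len(terms)  # intercept + item variance + one per term


selected = forward_select(list(TOY_GAINS), toy_fit)
print(selected)
```

In a real analysis `toy_fit` would be replaced by a call that refits the mixed model with the given terms and returns its log-likelihood and parameter count.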

    To be consistent with the protocol in Fukumura et al. (2010) we excluded

    references to the competitor. The total number of data points expected, given the

    number of participants (172) and items (20), was 3440; there were 66 non-responses,

    so the actual number was 3374. We excluded the following data from all

    analyses: 86 responses that referred to the competitor or

    were (partly) unintelligible, and a problematic

    experimental item (N = 115), for a total of 201 responses, i.e. 6% of the data.

    In the first analysis, the repeated name was expected to feature as the subject

    in the first sentence that participants produced to describe the second picture in the

    experimental item. As in Fukumura et al. (2010) we excluded a further 155 tokens

    where the target referent was indefinite or lacked a determiner, as well as 310 tokens

    that were not exact repetitions of the named referent. Altogether, 19% of the data was

    excluded from the first analysis. The remaining responses included a total of 1766

    NPs and 942 pronouns.
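
The counts reported in the two preceding paragraphs can be checked with a few lines of arithmetic. The sketch below uses only figures stated in the text. The variable names are introduced here, and taking the 3374 collected responses as the denominator for the 6% figure is an assumption, since the text does not say which base was used.

```python
participants, items = 172, 20
expected = participants * items                 # data points expected
no_response = 66
actual = expected - no_response                 # responses actually collected

# Excluded from all analyses: competitor/unintelligible responses + problematic item.
excluded_all = 86 + 115
share_all = excluded_all / actual               # assumed denominator: collected responses

# Further exclusions for the first analysis only.
indefinite_or_bare = 155
not_exact_repetition = 310
remaining = actual - excluded_all - indefinite_or_bare - not_exact_repetition

nps, pronouns = 1766, 942
print(expected, actual, excluded_all, f"{share_all:.1%}", remaining, nps + pronouns)
```

The responses left after all exclusions equal the sum of the reported NPs and pronouns, so the counts for the first analysis are internally consistent.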


    The dependent variable was the likelihood of producing a definite NP (as

    opposed to a pronoun) to identify the target referent in the second picture of the

    experimental items. We used logistic regression to model the probability (in terms of

    logits) associated with the values of the dependent variable. NP use was predicted

    by the visual presence of a competitor (z = 3.21, p [...]

    the Simon task score and the presence of a discourse competitor (z = 2.12, p = .03).

    Further, there was an interaction between language experience

    (monolingual/bilingual) and language proficiency (below/above the monolingual

    mean) (z = -2.15, p = .03) whereby monolingual children below the language

    proficiency mean used more NPs than bilingual children below the language

    proficiency mean. For children above the language proficiency mean there was no

    difference as a function of language experience as shown in Figure 5.
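
Because the first analysis models probabilities on the logit scale, a short illustration of the transformation may help. The sketch below is not a model fit: it converts the raw counts reported for the first analysis (1766 NPs and 942 pronouns) into an overall proportion and its log-odds. The function names are introduced here for illustration.

```python
import math


def logit(p: float) -> float:
    """Log-odds of a probability strictly between 0 and 1."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie strictly between 0 and 1")
    return math.log(p / (1.0 - p))


def inv_logit(x: float) -> float:
    """Map a log-odds value back to a probability."""
    return 1.0 / (1.0 + math.exp(-x))


nps, pronouns = 1766, 942
p_np = nps / (nps + pronouns)      # raw proportion of NP responses
log_odds_np = logit(p_np)          # same quantity on the logit scale

print(f"proportion of NPs = {p_np:.3f}, log-odds = {log_odds_np:.3f}")
```

A positive coefficient in such a model raises the log-odds of producing an NP, and `inv_logit` translates a predicted log-odds value back into a probability.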

    Insert Figure 5 here

    As children used NPs other than the repeated name in their story continuation,

    in a second set of logistic regression analyses, we investigated the level of

    informativity of the label used to identify the target referent. The dependent variable

    included all the referential expressions that children used to identify a target referent

    where there was evidence of an attempt at discourse integration; we therefore

    excluded all bare nouns, indefinite NPs and indefinite pronouns (155 items), with

    8.3% of data excluded in total. The dependent variable was binary and had two

    levels: (1) underinformative expressions - third person singular pronouns and less

    informative definite NPs (e.g. the man; the lady), and (2) more informative definite

    NPs (repeated NPs from the preamble, semantic substitutions, e.g. the prince for the

    king). Using the WM score where language proficiency and age were partialled out

    did not allow the model to converge; we therefore used the raw WM score. The

    optimal model shows that children were more informative in the presence of a visual

    competitor (z = 2.15, p = .03), while the mention of a discourse competitor had no

    significant effect (z = -1.15, p = .25). The interaction between WM and language

    proficiency was a significant predictor of informativity (z = 9.59, p < .001), while

    none of the other predictors made a significant contribution to the model.
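
The z statistics and p-values reported for this model can be related with a short calculation. The sketch below assumes that the z values are Wald statistics compared against a standard normal distribution with a two-sided test. That is the usual reading of such output, but the text does not state it explicitly. The function name `two_sided_p` is introduced here.

```python
import math


def two_sided_p(z: float) -> float:
    """Two-sided p-value for a z statistic under a standard normal reference."""
    return math.erfc(abs(z) / math.sqrt(2.0))


# (z, reported p) pairs taken from the text for the second analysis.
reported = {
    "visual competitor": (2.15, 0.03),
    "discourse competitor": (-1.15, 0.25),
}

for name, (z, p_reported) in reported.items():
    print(f"{name}: z = {z:+.2f}, computed p = {two_sided_p(z):.2f}, reported p = {p_reported}")
```

Under that assumption the computed values round to the p-values given in the text.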


    As we did earlier, we repeated this analysis including the mean monolingual

    language proficiency as a threshold to investigate a potential language proficiency

    disadvantage for bilingual children in the production of informative NPs. The effect

    of visual competitor was significant (z = 2.14, p = 0.03), as was the effect of WM (z =

    4.88, p < .001). Similarly to what we found in the first set of analyses, monolingual

    children (z = 3.56, p < 0.001) and children with language proficiency above the

    monolingual mean (z = 9.51, p < 0.001) produced significantly more informative NPs.

    The significant interaction between language proficiency and language experience (z

    = -2.18, p = 0.03) showed once again that there was no difference as a function of

    language experience for children whose proficiency was above the monolingual

    mean, but for those below the mean threshold monolinguals produced more

    informative NPs.
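
The proficiency threshold used in these follow-up analyses can be illustrated as follows. This is a minimal sketch with invented scores: the field names `group` and `delv` are hypothetical, and treating a score exactly equal to the monolingual mean as "below" is an assumption, since the text does not say how ties were handled.

```python
from statistics import mean
from typing import Dict, List, Union

Child = Dict[str, Union[str, float]]


def monolingual_mean(children: List[Child]) -> float:
    """Mean proficiency score of the monolingual children only."""
    return mean(c["delv"] for c in children if c["group"] == "monolingual")


def above_threshold(children: List[Child]) -> List[int]:
    """1 if a child's score is above the monolingual mean, else 0."""
    cut = monolingual_mean(children)
    return [int(c["delv"] > cut) for c in children]


# Invented scores, for illustration only.
sample = [
    {"group": "monolingual", "delv": 0.9},
    {"group": "monolingual", "delv": 0.5},
    {"group": "bilingual", "delv": 0.95},
    {"group": "bilingual", "delv": 0.4},
]

flags = above_threshold(sample)
print(flags)
```

The resulting binary flag can then enter a model together with language experience, so that their interaction compares monolingual and bilingual children on each side of the threshold.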

    Our third and final set of analyses investigated the possible causes for not

    encoding the target referent with a definite NP or a pronoun (which resulted in

    exclusion from the first and the second analyses). This third analysis revealed

    whether children were able to integrate the discourse information provided in the

    preamble – where the target was introduced with or without a competitor – and the

    target in their own scene description. The dependent variable was the definiteness of

    the target expression used, a proxy measure for discourse integration. Only bare

    nouns were excluded (44 items), on top of the items excluded from all analyses. The

    excluded items amounted to 7.3% of the data in total. In this logistic regression

    analysis, the coefficients indicate the likelihood of using a definite expression, thereby

    integrating the target expression with the preceding discourse without discriminating

    further between more informative full NPs and less informative pronouns. Very few

    items displayed lack of discourse integration: 3% in monolinguals and 4% in


    bilinguals.

    The presence of a visual competitor adversely affected discourse integration (z

    = -2.87, p [...]

    discourse and visual information in a complex referential communication task.

    Cognitive control skills, verbal WM, language proficiency, language exposure and

    use, and SES were investigated as predictors of the choice of discourse-appropriate

    anaphoric expressions in the task.


    With the exception of analysis 2, cognitive control – as indexed by the Simon

    task score – was a significant predictor of NP use. In analyses 1 and 3, when a

    language proficiency threshold was introduced as a predictor, better cognitive control

    predicted sensitivity to the presence of a discourse competitor. In analysis 3, better

    cognitive control also predicted discourse integration in the absence of the additional

    language proficiency threshold.

    Within the context of the current experiment, the manipulation of the presence

    and discourse mention of a competitor to the target referent unpredictably varied the

    need to resolve a referential conflict. In the condition in which the target had no

    linguistic or perceptual competition no conflict arose. However, in the remaining

    three conditions the discourse and/or perceptual presence of a competitor created a

    referential conflict. The resolution of this conflict required the children to both inhibit

    the preferred choice of a pronoun for a recently mentioned target referent, and to use a

    more informative referential expression (an NP) instead for the benefit of their

    addressee. The unpredictability of an upcoming potential referential conflict

    necessitated a level of monitoring that we hypothesised would correlate with their

    cognitive control abilities as indexed by the performance on the Simon task.

    We never found an interaction between language experience and cognitive

    control in the prediction of NP use, suggesting that cognitive control abilities


    conferred an advantage to both groups of children independently of bilingualism,

    contrary to our initial hypothesis. This could be because the bilingual advantage for

    cognitive control abilities in this group of children was modest (albeit significant, see

    also De Cat et al., 2017). In our predictions we also hypothesised that whatever

    bilingual advantage there might be in cognitive control might be offset by bilingual

    children’s lower proficiency skills. We did find, at least in analyses 1 and 3, that the

    degree of exposure to and use of the home language correlated negatively with NP use

    before controlling for language proficiency. In an additional set of analyses we

    investigated whether keeping language proficiency constant for the monolingual and

    the bilingual children might mitigate the proficiency disadvantage against the

    bilinguals. Using the mean performance of the monolingual children on the language

    proficiency task we split the groups above and below the monolingual mean, and we

    repeatedly found that those bilingual children who had language proficiency skills

    above the monolingual mean were no different from their monolingual counterparts in

    the use of informative NPs. They were, however, no better, as might be expected on the

    assumption of a bilingual advantage in cognitive control. The reason for this lack of

    bilingual advantage, once proficiency was controlled for, is likely to stem from the

    heterogeneity of our bilingual group. We deliberately had very broad selection criteria

    for the bilingual children in our recruitment schools so that we could include all of the

    children that were classified in the UK education system as having English as an

    additional language (EAL learners). This resulted in children who differed vastly in

    the cumulative amount of input and output and in the range of languages spoken. As

    our understanding of the bilingual cognitive advantage is progressively refined, we

    now know that a large number of variables, both at the level of the individual

    bilingual speakers and at the level of the tasks used (Mishra et al., 2012), can
