
The relationship between tolerance of ambiguity and multilingualism revisited

This empirical study revisited the relationship between Tolerance of Ambiguity and multilingualism by surveying 302 English-knowing multilinguals in China.

Shijie Wang

01 Jul 2022 • 23 min read

You can access this Open Access article here!

Abstract

Tolerance of ambiguity (TA), an inclination to embrace incongruent scenarios, is highly relevant to second/additional language learning, which is an immersion in ambiguity. In applied linguistics research, recent studies have examined TA vis-à-vis multilingualism based on Herman et al.’s (2010) 12-item scale and identified a 3-item dimension of this construct (labelled as TA core) that is hypothesised to exist in different cultural contexts (Wei & Hu, 2019). The present study revisited the relationship between TA and multilingualism by surveying 302 English-knowing multilinguals in China. Factor analysis confirmed the presence of the TA core in the EFL context. A series of structural equation models tested the relationships between dimensions of TA and English achievement. Hierarchical regression analyses identified multilingualism (respectively operationalised as a global measure of multilingualism and self-rated proficiency in English), age, and education qualification as potentially important predictors for TA; more importantly, the unique contribution to TA from each of these predictors was calculated by means of a “more refined” (Wei, Liu, & Wang, 2020) data analysis approach based on hierarchical regression. Future research directions (e.g., considering a wider multilingual population and employing the above-cited more refined approach) are also suggested.

1. Introduction

Personality psychologists have developed models to classify core personality traits (Roberts et al., 2008), among which the Big Five model is “almost certainly the most well established” (Teimouri et al., 2020). Higher-order personality traits, such as Extroversion, Agreeableness, Conscientiousness, Openness to Experience, and Neuroticism in the Big Five model, have received scholarly attention in applied linguistics research. In contrast, relatively little attention has been paid to lower-order personality traits, such as musicality (e.g., Boll-Avetisyan et al., 2016), perfectionism (e.g., Dewaele et al., 2017; Gregersen & Horwitz, 2002), risk-taking (e.g., Dewaele, 2012), and tolerance of ambiguity (Wei & Hu, 2019). However, as Dörnyei and Ryan (2015, p. 34) observe, it is “with lower-order personality constructs” that “a large proportion of the meaningful findings” emerged concerning the relationship between personality and L2 achievement. Hence more studies of lower-order personality traits are needed.

The present study aims to further the current understanding of lower-order personality traits by focussing on tolerance of ambiguity (TA), an individual difference (ID) variable defined as “the tendency to perceive ambiguous situations as desirable” (Budner, 1962).¹ Previous researchers hypothesised that this construct is pertinent to L2/LX learning because such learning is inherently ambiguous: “L2 learning involves the appropriation of new and/or modified patterns of language and meaning, which can be unfamiliar and complex” (van Compernolle, 2017, p. 319). Since Rubin's (1975) influential “good language learner” research, a higher TA has been recognised as one important trait of successful L2/LX learners, because a “good language learner is … comfortable with uncertainty … and willing to try out his guesses” (Dörnyei & Ryan, 2015, pp. 32; 45).

Therefore, in recent years, the construct of TA has been extensively assessed in relation to language-related variables in the field of applied linguistics. Dewaele and Li (2013) was the very first article in our field to adapt the TA instrument from Herman et al. (2010), who reported and tried to solve the “measurement challenges” (p. 59, see Literature Review for more) of TA. In this seminal paper, by surveying 2158 multilinguals of different nationalities, Dewaele and Li (2013, p. 236) found that multilingualism (operationalised as what they called “Global Measure of Multilingualism”, GMM) exerted “a small but significant effect on TA”. A partial replication in the Chinese EFL context confirmed that GMM was one “important” predictor for TA (Wei & Hu, 2019). In this follow-up study, Wei and Hu found that the factorial structure of TA partially overlapped with that of Herman et al. (2010), and referred to a three-item factor as “TA core”, which was hypothesised to be stable in different research settings. Wei and Hu's (2019) study was conducted in the Chinese EFL context and seems to have made an important breakthrough in ascertaining the TA factorial structure. If future replication studies were to take place, another study in the same EFL context would be prioritised.

Partly motivated by the above call, the present study revisits the relationship between TA and multilingualism, by empirically confirming Wei and Hu’s (2019, p. 1210) very “first attempt” to evaluate the applicability of Herman et al.’s (2010) instrument in EFL contexts. This study contributes to four primary areas. First, it contributes to the current studies of (lower-order/higher-order) personality traits, complementing the fruitful research on the cognitive correlates of language-related variables (cf. Dewaele, 2012; Dewaele & van Oudenhoven, 2009; Wei & Hu, 2019). Second, it helps ascertain the factorial structure of TA, so as to meet “measurement challenges” surrounding TA (Herman et al., 2010, p. 59) in EFL contexts and possibly other contexts as well. Third, we further depict Chinese multilinguals' psychological profiles, based on which upcoming studies could explore the relevant ID variables in much greater depth. Last but not least, we endeavour to echo the call for fuller use of effect size (in terms of both its reporting and interpreting) (Larson‐Hall & Plonsky, 2015; Wei et al., 2019) in the field of multilingualism and beyond, by underlining effect size (rather than the p-value) not only in analysing the data from the present study but also in synthesising findings from previous studies.²

2. Literature review

2.1. Research measuring TA based on Herman et al.’s (2010) scale

In the field of applied linguistics, one prominent line of inquiry concerning TA has utilised an instrument adopted or adapted from the TA scale developed by Herman et al. (2010), who identified three main challenges surrounding the instrumentalisation of TA: (1) weak psychometric attributes, (2) potential multidimensionality, and (3) the impact of context on individual TA. Specifically, first of all, many earlier measurements of TA suffered from relatively weak reliability and little in-depth discussion of the factorial structure. Second, previous studies disagreed on the construct dimensionality, with some arguing for TA's unidimensionality and others suggesting two, three, or even up to eight dimensions of TA. Third, in different research contexts, the instruments of TA did not converge. These pointed to the need for a sound measurement of TA. Motivated by these challenges, Herman et al. (2010) developed a new 12-item TA scale based on Budner's (1962) 16-item one.

In the first study using Herman et al.’s (2010) TA scale in the field of applied linguistics, Dewaele and Li (2013) tested the relationship between TA and multilingualism (F(2,1978) = 6.0, p < .003, η² = 0.008). To gauge the complicated concept of multilingualism, they used a “global measure of multilingualism” (GMM), namely “the sum of oral and written knowledge in various languages” (ibid., p. 232). This instrument is particularly useful in “distinguish[ing] sextalinguals with limited knowledge of three languages from trilinguals with advanced knowledge of three languages” (Dewaele & Li, 2014, p. 241). In more recent work by Dewaele and colleagues, GMM was described as a more “granular” instrument (Dewaele & Botes, 2020, p. 813) than “number of languages known”³ and hence is employed in the present study. Dewaele and Li (2013) also tested the relationships between TA and variables such as age (r (1956) = 0.084, p < .0001), gender (F(2, 1980) = 1.44, p = ns), stayed abroad or not (F(2,1980) = 11.0, p < .0001, η² = 0.011), and grew up monolingually or multilingually (F(2,1980) = 0.39, p = ns). However, by relying on statistical significance levels without an in-depth interpretation of the effect sizes, they may have diminished the value of their findings. For statistically insignificant results, they only reported “ns”. Also, a post-hoc analysis for the ANOVA results would have made the findings more informative.

Secondly, the study (N = 379) by van Compernolle (2016), a partial replication of Dewaele and Li's (2013) study, also confirmed the positive correlation between GMM and TA (Spearman rho = .19, p < .0002). However, this study, unfortunately, did not compare its statistically significant result (viz. Spearman rho = .19) with its counterpart (i.e., eta² = 0.008, roughly equivalent to an r-family effect size of 0.028) from Dewaele and Li (2013). When the above two effect size values are compared, a huge disparity (about 0.200 vs 0.028) emerges. Adequate explanations for such disparity may not be possible before there are enough replications. Additionally, van Compernolle (2016, p. 66) found that “TA is related to age” (Spearman rho = .21), but again did not compare his finding with the “strong positive relationship” (r = 0.084) from Dewaele and Li (2013, p. 235). The present study attempts to overcome this limitation by synthesising the different effect size types (e.g., eta² and Spearman rho) reported in previous studies.

Thirdly, Liu et al. (2017) partially replicated Dewaele and Li (2013) by surveying 132 Singaporean undergraduates, claiming that “No significant correlation between global proficiency on TA was found, p = .196”. This replication study conducted in Singapore was interesting because it found a statistically insignificant link between multilingualism and TA. Nevertheless, Liu et al.’s (2017) data analysis, unfortunately, failed to report effect sizes. Furthermore, although Liu et al. (2017) claimed that they adapted the TA scale from Dewaele and Li (2013), similar to van Compernolle (2016), they did not present the reliability or validity information of the instrument. It is suggested that researchers conducting quantitative studies in applied linguistics should report the reliability and validity information of their instruments (Kong & Wei, 2019). Considering the transparency in instrumentation, the present study will follow this suggestion.

Fourthly, as another partial replication of Dewaele and Li (2013), Wei and Hu's (2019) study re-evaluated the previous result and emphasised that the wording “small but significant” possibly diminished the value of Dewaele and Li's (2013) finding. Accordingly, using Dewaele and Li's (2013) effect size as “a useful starting point”, Wei and Hu (2019) proposed a topic-specific effect size interpretation scheme in which 1% represents a typical cutting point for the relationship between multilingualism and TA, which will be used in the present study together with the widely used Cohen's (1988) one to interpret the results. After proposing this system, they surveyed 260 English-knowing Chinese multilinguals, finding that (1) GMM, the number of languages known, and gender emerged as “important” predictors for TA, which respectively explained 1.4%, 1.9%, and 1.3% of the TA variance (i.e., more than 1% each), and (2) through hierarchical regression, TA was found to be predicted by the number of languages known (accounting for 1.9% of the TA variance, p = .027), gender (1.3%, p = .017), education (0.4%, p = .028), and length of stay abroad (0.3%, p = .041). Although these authors managed to ascertain the relative importance of each independent variable in predicting TA, their analysis would be improved if a more refined approach (viz. providing a range of effect sizes, rather than one single effect size for each predictor, Wei et al., 2020) were adopted (see also Section 3.5). Hence, the present study will revisit the pertinent relationships with more rigorous approaches (see Research Questions 2 and 3).

Also, in their study, Wei and Hu (2019) identified one 3-item factor (labelled as “TA core”) which paralleled that in Herman et al. (2010). This led to their hypothesis of the cross-cultural stability and validity of this 3-item TA core. This hypothesis potentially contributes to addressing the above-reviewed measurement challenges, including the disagreement regarding construct dimensionality and the contextual influence on individual TA. Therefore, the present paper will also replicate the factor analysis process to test the hypothesised existence of the TA core (see Research Question 1). Specifically, we will revisit the TA-related findings in one expanding-circle country (i.e., the Chinese EFL context). Together with the results based on some inner-circle and outer-circle contexts (e.g., USA, UK, and Singapore) examined in previous studies, the present study could help substantiate the existing hypothesis about TA. Besides, China by itself is an important context for research on the relationship between TA and language achievement because (1) the huge typological difference between Chinese and English poses more “ambiguities” to the language learners, (2) the wide use of prescriptive instruction somewhat hinders the possibility of tolerating the ambiguities in English, and (3) it accommodates the largest EFL community around the world with over 390 million learners after 2000 (Wei & Su, 2012; You & Dörnyei, 2016).

2.2. Research measuring TA with other instruments

Another line of inquiry concerning TA has employed instruments other than Herman et al.’s (2010) scale, such as the Second Language TA Scale (SLTA, see Ely, 1995; Başöz, 2015) or other self-developed items (e.g., Thompson & Lee, 2013).

For example, Dewaele and Ip (2013) surveyed 73 secondary-school Chinese EFL learners in Hong Kong with the SLTA questionnaire and identified “a significant correlation” (p. 56) between SLTA and self-rated proficiency in English (r = 0.684, p < .0001), which will be compared with our results later. One limitation of this study is that it did not report effect sizes for the statistically non-significant relationships between SLTA and other relevant variables (e.g., dialect knowledge), which may mislead the audience in data interpretation, especially in a study with a small sample size (representing lower power). To complement it, the present study will report effect sizes for both statistically significant and non-significant results, following previous suggestions (Larson‐Hall & Plonsky, 2015; Wei et al., 2020). However, as a domain-specific scale, SLTA also suffers from some “measurement challenges” such as the divergence in comparative research.

Based on a self-developed instrument to measure second language (L2) anxiety, Thompson and Lee (2013) identified an emotion-based TA factor (fear of ambiguity) as a subdimension of L2 anxiety. In their study, the participants with higher proficiency in English (viz. a measure of multilingualism) had a lower level of fear of ambiguity in L2 learning. In Thompson and Lee's (2013) study, TA appears to be a by-product of their research focused upon L2 anxiety; furthermore, their instrument for TA⁴ is strongly oriented towards measuring emotions (viz. more transient psychological IDs), rather than personality traits (viz. more stable psychological IDs). Hence, the present study chooses to focus more on studies treating TA as a personality trait than on Thompson and Lee's (2013).

To corroborate these results of extant quantitative studies, the present study will further this line of research via (1) re-examining the factorial structure of TA based on a new sample, (2) confirming the magnitude of effect between TA and achievement variables, and (3) evaluating to what extent other sociobiographical variables (e.g., education) could predict TA. Methodologically, the present study will try to overcome the limitations of some previous studies including (1) failing to make full use of effect sizes, (2) failing to provide sufficient validity and reliability information, (3) failing to critically evaluate the instrument used in their studies, and (4) with specific reference to regression, using only one single effect size value/type.

3. The study

Research questions (RQs)

RQ1: What is the factorial structure of Herman et al.’s (2010) TA scale in the Chinese EFL context?

RQ2: To what extent does multilingualism predict TA?

RQ3: To what extent do other selected sociobiographical variables (e.g., age, education qualification, gender, and length of stay abroad) predict TA?

3.1. Instrument

The participants were asked to complete an English-language questionnaire comprising four main sections: basic sociobiographical information, the TA scale, the GMM scale, and self-rated English proficiency.

To measure TA, Wei and Hu's (2019) 11-item Likert scale (1 = “strongly disagree” and 5 = “strongly agree”) was adopted. In Wei and Hu's (2019) study, after testing the psychometric properties (e.g., through reliability analysis) of Herman et al.’s (2010) original TA scale, it was found that one item (which was later deleted) “dragged the overall Cronbach alpha value down to below 0.60” (p. 1213). Hence, Wei and Hu’s (2019) 11 items, rather than Herman et al.’s (2010) original 12 items, were used in the present study. Prior to formal administration, the instrument had been piloted among 30 Chinese multilingual individuals. As the reliability analysis generated an acceptable overall Cronbach alpha value (α = 0.721), all 11 items were kept in the later formal administration of the survey.
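
As an illustration of the reliability check mentioned above, the following minimal sketch computes Cronbach's alpha from a respondents-by-items table of Likert scores using the standard formula. The function name and the toy responses are hypothetical and are not the study's data or software.

```python
from statistics import variance


def cronbach_alpha(responses):
    """Cronbach's alpha for a list of rows (one row of item scores per respondent)."""
    k = len(responses[0])                      # number of items
    items = list(zip(*responses))              # transpose rows into item columns
    item_variances = [variance(col) for col in items]
    total_variance = variance([sum(row) for row in responses])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


# Hypothetical answers from five respondents to three 5-point Likert items.
toy_responses = [
    [4, 5, 4],
    [2, 3, 2],
    [3, 3, 4],
    [5, 4, 5],
    [1, 2, 1],
]
alpha = cronbach_alpha(toy_responses)
print(f"Cronbach alpha = {alpha:.3f}")
```

With real questionnaire data, each row would hold one participant's answers to the 11 TA items.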

To gauge the complicated variable of multilingualism, two measures were used to yield a more reliable result. Specifically, the first measurement was based on the GMM instrument used in both Dewaele and Li (2013) and Wei and Hu (2019). It indicates each respondent's general level of multilingualism by totalling the oral and written self-perceived proficiency scores (e.g., L1 writing + L1 speaking + L2 writing + L2 speaking + …). The second measure of multilingualism, operationalised as self-rated proficiency in English, was gauged with the self-report proficiency instrument from Taguchi et al. (2009); this measure was adopted largely because it is clear and easy to use and has been adopted in recent studies on IDs similar to TA (e.g., Teimouri et al., 2020).
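
The GMM score described above is a simple sum, which the following sketch makes concrete. The rating values and the 0 to 5 range are assumptions made for illustration only; the source does not state the scale used for the self-perceived proficiency items.

```python
def global_measure_of_multilingualism(ratings):
    """Sum speaking and writing self-ratings over every language a respondent knows.

    ratings maps a language label to a (speaking, writing) pair of self-ratings.
    """
    return sum(speaking + writing for speaking, writing in ratings.values())


# Hypothetical respondents; the 0-5 rating range is an assumption.
bilingual = {"L1 Chinese": (5, 5), "L2 English": (4, 3)}
trilingual = {"L1 Chinese": (5, 5), "L2 English": (4, 4), "L3 Japanese": (2, 1)}

gmm_bilingual = global_measure_of_multilingualism(bilingual)
gmm_trilingual = global_measure_of_multilingualism(trilingual)
print(gmm_bilingual, gmm_trilingual)
```

Because the score adds proficiency rather than counting languages, a respondent with limited knowledge of an extra language gains only a little over a proficient bilingual.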

3.2. Procedure

The study included two segments, the pilot study (N = 30) and the main study (N = 302). Based on the feedback from the piloting, the questionnaire was modified stylistically (e.g., underlining some parts of several items). For instance, the two parts of the statement “What we are used to is always preferable to what is unfamiliar.” were underlined to make the reading easier for the participants. The formal questionnaire was released and distributed on an online platform, Wenjuanwang.com, from October to November 2018. Throughout the study, the participants were notified of their right to stop filling in the questionnaire at any time. The anonymity of the respondents was and will be protected.

3.3. Participants

Overall, 302 Chinese multilinguals (200 females, 102 males) aged 18 to 42 (mean = 23.15, SD = 3.708) took part in the present study. Concerning education qualification, the sample comprised 174 respondents who had or were studying for bachelor's degrees, 94 for master's degrees, and 16 for doctoral degrees, with 18 missing values. At the time of the survey, the majority of the sample (n = 249, 82.5%) had never lived abroad.

All the participants recognised Chinese as their first language with at least one additional language. Specifically, there were 271 bilinguals (90%), 24 trilinguals and seven quadrilinguals. English (n = 302) was the L2 of all the participants; Japanese (n = 11) was the most frequent L3, followed by French (n = 6), Spanish (n = 3), German (n = 2), Korean (n = 1), and Russian (n = 1). The pattern of L4 is French (n = 3), Korean (n = 2) and Spanish (n = 2).

3.4. Data analysis

For RQ1, exploratory factor analysis (EFA) was employed to identify the underlying factors of the TA scale in the present sample. Confirmatory factor analysis (CFA) was conducted only for reference and is not highlighted, because Herman et al.’s (2010) TA scale has shown an inconsistent factorial structure in previous studies. RQ2, which concerns the influence of multilingualism on TA, was answered via structural equation modelling. RQ3 was addressed via hierarchical regression, to explore the unique contribution of other sociobiographical variables to TA. Different from the traditional approach, we adopted Wei et al.’s (2020) suggestion about “attempt[ing] all possible sequences and provide a range of effect sizes for each predictor”. This is because the sequence in which variables are entered into the regression model may bias the variance attributed to each predictor (Larson-Hall, 2016). When well-established theories (or strong logical reasons; see Tabachnick & Fidell, 2013) are absent, the proposed “more refined” method may be more valid. The standard alpha level (i.e., 0.05, non-directional) was used for significance testing. Exact p-values are reported, except very small ones, which are noted as p < .0005.

4. Findings and discussion

4.1. The factorial structure of the TA scale

As the factorial structure of Herman et al.’s (2010) TA scale is not stable in the previous studies, no solid theory could support the direct use of CFA (Fabrigar & Wegener, 2011, p. 28). Before EFA, the assumptions were checked including sphericity (Bartlett's test χ² (55) = 581.527, p < .0005), sampling adequacy (Kaiser–Meyer–Olkin = 0.756), and sample-size-to-variables ratio (27.7). The extraction method was principal components analysis which was also used in previous studies. It is predicted that dimensions of the TA scale may be inherently interrelated especially “for naturalistic data, and certainly for any data involving humans” (Field, 2009, p. 644). Therefore, the oblique rotation method (specifically, direct oblimin) was adopted. Both the eigenvalue >1 and the scree plot were used to determine the factors extracted.
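
Two of the checks named above can be sketched in a few lines: Bartlett's test of sphericity (the chi-square statistic and its degrees of freedom) and the eigenvalue > 1 rule. This is a minimal sketch on simulated data, assuming NumPy is available; the simulated loadings and the function names are hypothetical, and the sketch does not reproduce the study's principal components extraction or direct oblimin rotation.

```python
import numpy as np


def bartlett_sphericity(data):
    """Return Bartlett's chi-square statistic and degrees of freedom."""
    n, p = data.shape
    corr = np.corrcoef(data, rowvar=False)
    _, log_det = np.linalg.slogdet(corr)       # log of the determinant of R
    chi_square = -(n - 1 - (2 * p + 5) / 6) * log_det
    df = p * (p - 1) // 2
    return chi_square, df


def kaiser_count(data):
    """Number of correlation-matrix eigenvalues greater than 1."""
    corr = np.corrcoef(data, rowvar=False)
    eigenvalues = np.linalg.eigvalsh(corr)
    return int((eigenvalues > 1).sum())


# Simulated (not real) data: 302 respondents, 11 items driven by two latent factors.
rng = np.random.default_rng(0)
n_respondents, n_items = 302, 11
factors = rng.normal(size=(n_respondents, 2))
loadings = np.zeros((2, n_items))
loadings[0, :6] = 0.7                          # first six items load on factor 1
loadings[1, 6:] = 0.7                          # remaining five items load on factor 2
simulated = factors @ loadings + rng.normal(scale=0.7, size=(n_respondents, n_items))

chi_square, df = bartlett_sphericity(simulated)
print(f"Bartlett chi-square({df}) = {chi_square:.1f}")
print("Factors with eigenvalue > 1:", kaiser_count(simulated))
```

With 11 items the degrees of freedom are 11 × 10 / 2 = 55, which matches the value reported for the present sample.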

The EFA extracted three factors which account for a total of 52.542% of the variance (see Appendix 1). As it is suggested that “at least three to five measured variables reflecting each common factor should be included, although even more is generally desirable” (Fabrigar & Wegener, 2011, p. 25), the first two factors will be retained in later analysis. The first factor (TA1, TA4, TA5, TA9, TA10, TA11) is a new dimension that does not parallel those in other studies (Cronbach alpha = .672; accounting for 28.862% of the variance of the TA scale), whereas the second factor (TA3, TA7, TA8) corresponds to the “challenging perspectives” in Herman et al. (2010) and “TA core” in Wei and Hu (2019) with a Cronbach alpha value of 0.718 (accounting for 13.081% of the variance of the TA scale). The consistent extraction of “TA core” is an important finding, which coheres with the hypothesis that this factor is the “very part of TA that could be found across different cultural contexts” (Wei & Hu, 2019).

To confirm the above results, CFA was also performed based on the factorial structure identified above. For the 6-item TA factor (henceforth “TA new”), the CFA model generated a series of model fit indices including CFI (0.985), TLI (0.975), RMSEA (0.034), SRMR (0.031), and GFI (0.988) (see Appendix 3 for more details), which indicated that this model had high construct validity. In contrast, as the TA core only comprised three items, the relevant model fit indices could not be generated; but the factor loadings (0.5 or higher) for each of those three items (see Fig. 1) indicated that these items were suitable for later analysis.

4.2. GMM, self-rated English proficiency, and TA

Relevant assumptions for structural equation modelling (SEM) were checked including normality of relevant variables (Skewness = −0.904 to 0.431; Kurtosis = −1.525 to 2.148) and linearity (using scatterplot). The other assumptions (e.g., multicollinearity and independent errors) were checked after the model was developed by observing the VIF value and Durbin-Watson test, respectively. As two instruments were used (i.e., GMM⁵ and self-rated English proficiency⁶), two parallel sets of SEM models were tested.

The first set of SEM models indicated that GMM exerted a statistically significant effect on “TA core” (β = 0.152, p = .008, Standardised β = 0.223, R² = 0.050, as shown in Fig. 1) and a statistically insignificant effect on “TA new” (β = 0.021, p = .486, Standardised β = 0.061, R² = 0.004, as shown in Fig. 2).

The second set of SEM models showed a similar pattern: self-rated English proficiency statistically significantly predicted “TA core” (β = .119, p = .016, Standardised β = 0.178, R² = 0.032, as shown in Fig. 3) but not “TA new” (β = 0.012, p = .655, Standardised β = 0.035, R² = 0.001, as shown in Fig. 4).

The relationships between “TA core” and the two instruments of multilingualism all met the “large” benchmark proposed by Wei and Hu (2019) and the “small” benchmark by Cohen (1988), which resonated with the previous findings (e.g., 1.4% from Wei & Hu, 2019; η² = 0.008 from Dewaele & Li, 2013). One reason why the effect sizes generated in our study are higher than the previous two may be the use of SEM, which builds latent variable models that allow for the calculation of error in the data (e.g., sampling error). Interestingly, “TA new” was not statistically significantly predicted by multilingualism, which warrants future investigation. The effect sizes showed that GMM was an important predictor for TA, and English proficiency was a potentially important one. These findings can be refined with results from hierarchical regression (see Section 4.3).
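
To make the benchmark comparison explicit, the sketch below labels the four R² values reported above against the two cut-offs attributed in this article to Wei and Hu (2019): 1% (“typical”) and 2% (“large”). The function name and the label “below typical” are placeholders introduced here, not terms from that scheme.

```python
def classify_wei_hu(variance_explained):
    """Label a proportion of variance (e.g. 0.050 for 5%) using the 1% and 2% cut-offs."""
    if variance_explained >= 0.02:
        return "large"
    if variance_explained >= 0.01:
        return "typical"
    return "below typical"                     # placeholder label, not from the scheme


# R-squared values as reported for the four SEM models in Section 4.2.
reported_r_squared = {
    "GMM -> TA core": 0.050,
    "English proficiency -> TA core": 0.032,
    "GMM -> TA new": 0.004,
    "English proficiency -> TA new": 0.001,
}
labels = {path: classify_wei_hu(r2) for path, r2 in reported_r_squared.items()}
for path, label in labels.items():
    print(f"{path}: {label}")
```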

The difference between TA core and TA new was reanalysed to explore the reason behind the above relationships. TA new appears to cover tolerance of unfamiliar values and experiences. By comparison, TA core represents openness to change. If this observation is valid, future studies should attempt to explore the much more complicated mechanism behind those relationships. For instance, we could examine whether the positive prediction from TA core to achievement is mediated by behavioural variables (e.g., the length of staying abroad).

4.3. Other selected sociobiographical variables and TA

Before the hierarchical regression analyses, six variables of interest were examined to explore whether they could be used as predictors in the regression model. A series of correlation analyses generated the correlation coefficient with “TA core” for the following variables: GMM (r = 0.195), English proficiency (r = 0.106), age (r = 0.209), and education qualification (r = 0.156); two independent-samples t-tests found that “stay abroad” (r = 0.043) and gender (r = 0.046) each exerted a very small effect on TA. As the effect sizes of “stay abroad” and gender fell below the “typical” benchmark (viz. 0.1) in Wei and Hu's (2019) interpretation system, they were excluded from later regression, whereas the other four variables were retained.

As mentioned above, we adopted a “more refined” approach to test all the entering sequences of the four predictors (i.e., 24 scenarios), generating effect size ranges rather than individual effect sizes. This procedure yielded the following effect size ranges: age (2.9%–4.9%), GMM (1.6%–3.7%), English proficiency (0.5%–1.5%), education qualification (0.3%–2.1%). The upper bound of the former two exceeded the “large” effect size benchmark (2%) in Wei and Hu's (2019) system, indicating that both age and GMM could be important predictors for TA; it is noteworthy that the lower bound for the former (2.9%) was higher than the large benchmark (2%), whereas that for the latter (1.6%) dropped below 2% (but still above the typical level), suggesting that age was a more important predictor for TA than GMM. The ranges of the latter two centred around the “typical” benchmark (1%), indicating that English proficiency and education qualification could be potentially important predictors for TA.
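
The all-sequences procedure can be sketched as follows: for each of the 24 orderings of four predictors, enter the predictors one at a time, record the increase in R² at each step, and report the minimum and maximum increase per predictor. This is a minimal sketch assuming NumPy and ordinary least squares; the data are simulated, the variable and function names are hypothetical, and it is not the authors' actual analysis script.

```python
from itertools import permutations

import numpy as np


def r_squared(X, y):
    """R-squared of an OLS fit of y on the columns of X plus an intercept."""
    design = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    residuals = y - design @ coef
    ss_res = float(residuals @ residuals)
    ss_tot = float(((y - y.mean()) ** 2).sum())
    return 1 - ss_res / ss_tot


def delta_r2_ranges(predictors, y):
    """Min and max R-squared increment of each predictor over all entry orders."""
    increments = {name: [] for name in predictors}
    for order in permutations(predictors):
        previous_r2 = 0.0
        entered = []
        for name in order:
            entered.append(name)
            X = np.column_stack([predictors[n] for n in entered])
            current_r2 = r_squared(X, y)
            increments[name].append(current_r2 - previous_r2)
            previous_r2 = current_r2
    return {name: (min(v), max(v)) for name, v in increments.items()}


# Simulated (not real) data with correlated predictors, N = 302.
rng = np.random.default_rng(1)
n = 302
age = rng.normal(size=n)
education = 0.6 * age + rng.normal(scale=0.8, size=n)
gmm = rng.normal(size=n)
english = 0.7 * gmm + rng.normal(scale=0.7, size=n)
ta_core = 0.2 * age + 0.15 * gmm + rng.normal(size=n)

predictors = {"age": age, "education": education, "GMM": gmm, "English": english}
ranges = delta_r2_ranges(predictors, ta_core)
for name, (low, high) in ranges.items():
    print(f"{name}: {low:.1%} to {high:.1%}")
```

The width of each range reflects how much a predictor's apparent contribution depends on what was entered before it, which is why correlated predictors such as age and education tend to show wider ranges.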

The findings of each independent variable were in line with those in previous investigations. First of all, the finding of GMM (1.6%–3.7% variance-accounted-for) in predicting TA echoed the results from Dewaele and Li (2013) and Wei and Hu (2019), which had recognised GMM as an important predictor of TA (albeit with effect sizes smaller than 2%). These studies seem to have demonstrated that multilingualism (perhaps with multiculturalism) can benefit individuals by providing positive personality resources.

Secondly, in relation to age, Dewaele and Li’s (2013, p. 235) seminal study identified “a strong positive relationship” (r = 0.084) between TA and participants’ age, and our study yielded a similar positive correlation (2.9%–4.9%). Our effect size was very similar to van Compernolle’s (2016, p. 66) finding (viz. Spearman rho = .21) for the relationship between TA and age, and the discrepancy could be attributed to the different ways to denote TA. Furthermore, when controlling the effect of education qualification on the relationship between age and TA, the effect is still statistically significant (p = .037) at the “typical” level in Wei and Hu's (2019) system and “small” level in Cohen's (1988) system (ΔR² = 0.012). Therefore, it appears that, as a multilingual grows older, (s)he becomes more tolerant of ambiguous situations.
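
Comparing correlation coefficients with variance-accounted-for percentages requires putting them on one scale. The sketch below uses the standard identity that a correlation r corresponds to r² of shared variance. Treating Spearman's rho and the regression-based ranges this way is only an approximation adopted here for illustration, and the function names are hypothetical.

```python
import math


def variance_accounted_for(r):
    """Proportion of variance shared by two variables with correlation r."""
    return r ** 2


def r_from_variance(proportion):
    """Size of the correlation that corresponds to a proportion of variance."""
    return math.sqrt(proportion)


# Age effects as reported in the studies discussed above.
print(f"Dewaele and Li (2013), r = 0.084: {variance_accounted_for(0.084):.1%}")
print(f"van Compernolle (2016), rho = .21: {variance_accounted_for(0.21):.1%}")
print(
    "Present study, 2.9% to 4.9%: r of about "
    f"{r_from_variance(0.029):.2f} to {r_from_variance(0.049):.2f}"
)
```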

Thirdly, the finding regarding English proficiency may not be directly compared with those based on previous research, as this was the first time the relationship between this variable and TA had been examined. The finding that English proficiency was a potentially important predictor (0.5%–1.5%) necessitates corroboration, modification, or falsification by future empirical studies. It seems that the higher the participants’ self-rated proficiency in English, the higher the level of TA they had. This finding was consistent with our finding concerning GMM, which is expected because an overwhelming majority of the participants were Chinese-English bilinguals in an additive environment.

Fourthly, regarding the predictor “education qualification”, our finding (0.3%–2.1%) was compatible with the corresponding effect size (0.4%) from Wei and Hu (2019). Thanks to the employment of the more refined approach, although education could only generate a relatively weak relationship in certain scenarios, it is a potentially important variable predicting TA with an upper bound of 2.1% variance-accounted-for.

5. Conclusion

Responding to Wei and Hu's (2019) call for further research concerning whether there is a core for TA, a lower-order personality trait, the present study has provided an affirmative answer. Specifically, it has revealed a three-factor structure of TA among English-knowing multilinguals in an EFL context, and confirmed the hypothetical “TA core” comprising three items originally developed by Herman et al. (2010). This study has also shown the different influences of selected sociobiographical variables on TA, including age (2.9%–4.9%), multilingualism (GMM, 1.6%–3.7%; self-rated proficiency in English, 0.5%–1.5%), and education qualification (0.3%–2.1%). Chinese foreign language learners' psychological profile was further depicted: when one has higher proficiency in English, (s)he is likely to have a higher TA. While many statistics textbooks claim that regression reveals a cause-and-effect relationship, we suggest that it is probably better not to speculate about causality but instead focus on the strength of association (measured by effect size). As Dewaele and Li (2013) rightfully highlighted, the causal path can in effect be multidirectional (i.e., multilingualism can be both a cause and an effect). Although having higher multilingualism in at least one additional language can push an individual to develop a higher level of a particular positive trait (e.g., TA), it is also possible to argue that an individual, who was born with a certain personality profile or developed such a profile early in life, is more likely to develop higher multilingualism later in life, through making active choices.

In connection with methodological contributions, two suggestions are proposed for future studies. First, we confirm the value of using Wei et al.’s (2020) more refined data analysis approach based on hierarchical regression (i.e., providing a range of effect sizes for each predictor) and hence would advocate this approach in future research. Future studies employing hierarchical regression would stand to gain from this procedure. Second, although Herman et al. (2010) claim that their TA scale is “a conceptually clear, internally consistent assessment tool” (p. 60) and demonstrate “its improved utility” over Budner's (1962) traditional TA inventory (p. 62), findings from both the present study and Wei and Hu (2019) reveal that the pertinence of Herman et al.’s (2010) instrument is largely confined to English-as-a-native-language or ESL contexts. We suggest that the component of “TA core” (viz. Items 3, 7, and 8, see Appendix 2) confirmed in this study be included in the instrumentation of future research that aims to measure TA in EFL contexts.

The present study represents one immediate replication following Wei and Hu (2019) in the same EFL context. Future studies will need to be undertaken in other EFL contexts to start with, and may gradually extend to ESL contexts, so as to ascertain to what extent the TA core exists in different cultural contexts. For international researchers working in EFL settings other than China or in ESL contexts, the present study is still useful as it provides food for thought in connection with an assessment of the profound question “Does multilingualism help shape personality?”. To date, empirical evidence has shown that multilingualism does help shape some higher-order personality traits (e.g., Open-mindedness, cf. Dewaele & Botes, 2020) and lower-order ones (e.g., TA). Unfortunately, in recent years the deficit view of multilingualism/multiculturalism has begun to re-emerge amongst politicians (e.g., Noack, 2015). The above two suggestions concerning methodology, together with adherence to “fuller use of effect sizes” (Kong & Wei, 2019, p. 50), will benefit researchers in different settings who aim to combat this deficit view with methodologically rigorous research on how multilingualism is linked to personality traits.

Despite these theoretical and methodological contributions, this study is not immune to three limitations. First, the foreign language (in this context, English) questionnaire may limit the participants to comparatively proficient English learners. Different results might emerge when a broader multilingual context is assessed with a validated TA instrument through the medium of the participants’ first language. Second, the online data collection method may be biased in some respects (Wilson & Dewaele, 2010). Therefore, it would be useful for future research to collect data based on a more “closed” paper-and-pencil design, to see whether the findings could be different (e.g., by incorporating the index of response rate; Wei & Hu, 2019). Third, in addition to the two measures of multilingualism employed in this study, other useful measures (e.g., Thompson & Khawaja's (2016) operationalisation of multilingualism) may generate different findings. These limitations need to be addressed with further research efforts.

Author statement

Rining Wei: Conceptualization, Resources, Writing - Original Draft, Writing - Review & Editing, Supervision, Project administration, Funding acquisition, Validation, Data Curation.

Yifan Kang: Methodology, Investigation, Writing - Original Draft.

Shijie Wang: Writing - Original Draft, Writing - Review & Editing, Conceptualization, Validation, Project administration, Formal analysis, Data Curation, Visualization.

Acknowledgements

The authors would like to extend their appreciation to the anonymous reviewers and the editor for their constructive comments on an earlier version of this paper. All remaining inadequacies are the authors' responsibility. The writing of this paper was supported by the Research Enhancement Fund of Xi'an Jiaotong-Liverpool University (REF-19-02-01).

Footnote

¹ Since Frenkel-Brunswik (1949) first proposed TA as an individual difference (ID) variable, different scholars have defined TA differently. But most definitions seem to have retained the essence of Budner's (1962) classic definition (see Furnham & Ribchester, 1995 for a review). While Budner's (1962) conceptualization was clear and useful, his initial TA instrument had weak psychometric attributes (e.g., an average internal consistency of 0.49). Fortunately, building on Budner's (1962) work, Herman et al. (2010) managed, through several iterations, to refine both the conceptualization and measurement of TA.

² The main limitation of the significance level is that the p-value is substantially influenced by sample size (i.e., a big sample size always entails a statistically significant relationship; Wei et al., 2019). We would encourage readers to focus on the estimation of effect size indexes over the p-value (cf. Kong & Wei, 2019; Loewen et al., 2014).
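
The dependence of the p-value on sample size can be illustrated with a small sketch. It holds a correlation fixed at r = 0.10 (a hypothetical value, not a result from this study) and approximates the two-sided p-value with the Fisher z transformation; the function name and the chosen sample sizes are likewise assumptions made for illustration only.

```python
import math


def approx_p_value(r: float, n: int) -> float:
    """Approximate two-sided p-value for a Pearson correlation r at sample size n.

    Uses the Fisher z transformation with a normal approximation,
    so the result is approximate rather than an exact t-test p-value.
    """
    z = math.atanh(r) * math.sqrt(n - 3)
    return math.erfc(abs(z) / math.sqrt(2))


r = 0.10  # hypothetical fixed effect size (r squared = 1% of variance)
for n in (50, 300, 1000, 10000):
    print(f"n = {n:>5}: r = {r:.2f}, p = {approx_p_value(r, n):.4g}")
```

In this sketch the effect size never changes, yet the approximate p-value shrinks steadily as n grows, which is why an effect size index is the more stable quantity to report.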

³ This measure of multilingualism (“number of languages known”) has mostly been used in earlier studies. For instance, in the field of applied linguistics, it was employed not only in Dewaele and van Oudenhoven (2009), “the first study” (Dewaele & Botes, 2020, p. 813) to investigate whether multilingualism could influence personality traits as measured by the Multicultural Personality Questionnaire (MPQ), but also in its later (partial) replications (e.g., Dewaele & Stavans, 2014). Similarly, in the field of intercultural studies, to explore the potential links between “foreign language mastery” (viz. multilingualism) and the five IDs on the MPQ, Korzilius et al. (2011) employed “number of foreign languages spoken” to measure the former, in addition to self-rated proficiency in each foreign language known.

⁴ Sample items from this TA instrument include “Even if I am well prepared for English class, I feel anxious about it” and “I worry about the consequences of failing my English class”.

⁵ As including all ten GMM items would compromise the parsimony of the SEM model, an item parcelling strategy was used, following Marsh et al.’s (1998) suggestion.

⁶ As English proficiency was measured with this single-item instrument, the authors set the factor loading to 0.927 and the error variance to 0.215, based on an assumed reliability of 0.8 and a variance of 1.074.
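
The two fixed values in footnote 6 can be reproduced with a short calculation. The sketch below assumes a commonly used convention for single-indicator latent variables, in which the loading is the square root of reliability times variance and the error variance is (1 − reliability) times variance. The footnote does not state these formulas, so they are an assumption here, although they do reproduce the reported figures; the function name is hypothetical.

```python
import math


def single_indicator_constraints(reliability: float, variance: float):
    """Return (loading, error_variance) for a latent variable with one indicator.

    Assumed convention (not stated in the source):
      loading        = sqrt(reliability * variance)
      error_variance = (1 - reliability) * variance
    """
    loading = math.sqrt(reliability * variance)
    error_variance = (1.0 - reliability) * variance
    return loading, error_variance


# Values reported in footnote 6: assumed reliability 0.8, observed variance 1.074
loading, error_variance = single_indicator_constraints(reliability=0.8, variance=1.074)
print(round(loading, 3), round(error_variance, 3))  # prints: 0.927 0.215
```

Under this convention the two fixed parameters sum (loading squared plus error variance) to the observed variance of 1.074, so the choice of reliability only determines how that variance is split between the latent factor and measurement error.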