Diversity in lifestyles and opinions, as well as in social groups and religions, shapes life in Germany. Whether diversity poses a challenge to social cohesion or instead improves society’s potential for innovation and creativity depends in particular on how a society values diversity. While a major part of the population in Germany has a positive attitude towards diversity, a smaller part sees it as a danger (e.g., Arant et al., 2019; Zick, 2019). Why do some people desire diversity, while others feel threatened by it? A key construct in this regard is Tolerance for Ambiguity (TA): The ability to deal with ambiguous stimuli or situations that are new, unfamiliar, insoluble, complex, uncertain, or open to contradictory interpretations represents a central construct for the acceptance of diversity (Budner, 1962; Frenkel-Brunswik, 1949, 1951; Herman et al., 2010; McLain, 1993, 2009). Although TA was originally defined as a stable personality trait, its stability is controversial and still debated today (see Stability of the Construct). Furthermore, TA is sometimes assumed to be more of a context-specific construct (see Generality and Dimensionality of the Construct). TA has been shown to be negatively associated with right-wing political extremism, racism, and ethnic prejudice (e.g., Adorno et al., 1950; Budner, 1962; Gründl & Aichholzer, 2020; Lind, 1987) and positively associated with prosocial behaviour, intercultural competence, and creativity (e.g., Caligiuri & Tarique, 2012; Vives & FeldmanHall, 2018; Zenasni et al., 2008). Although it seems to be an important indicator of the acceptance of diversity and appears essential for the functioning of a pluralistic society, TA is rarely included in population surveys in Germany. In addition, there are few validated German-language measurement instruments that are suitable for general population surveys.
This paper proposes a measurement instrument in the German language to address these concerns. The instrument is based on the English Tolerance for Ambiguity Scale (TAS), which comprises 12 items measuring four sub-dimensions of TA: valuing diverse others, change, challenging perspectives, and unfamiliarity. The 12 items were translated into German using the TRAPD approach (Translation, Review, Adjudication, Pretesting, Documentation; Harkness, 2003; Harkness et al., 2004) and empirically tested within a population survey. To evaluate the psychometric quality of the scale, validity (factorial and construct validity) and reliability (McDonald’s omega coefficient and test-retest stability) are considered.
The article begins by discussing issues regarding the concept and definition of TA before introducing the state of the art for its measurement. After providing an overview of the methods and data, the results are presented and discussed.
Conceptual and Definitional Issues
Conceptualisations and Definitions of Tolerance for Ambiguity
The concept of TA was introduced by Frenkel-Brunswik (1949, 1951) and was embedded in the work of Adorno et al. (1950) within the social psychological concept of the authoritarian personality, in which the authors conceptualized a personality structure that was particularly susceptible to fascist ideologies and the devaluation of others. Various researchers have attempted to conceptualise TA, taking slightly different conceptual perspectives (e.g., Budner, 1962; Frenkel-Brunswik, 1949, 1951; McLain, 1993, 2009). The perspective of Frenkel-Brunswik (1949, 1951) has been associated with authoritarianism and prejudice, with intolerance for ambiguity understood as intolerance of diversity among people. Further, she defined TA as ‘one of the basic variables in both the emotional and the cognitive orientation of a person toward life’ (p. 113) and as the ability to recognise the ‘coexistence of positive and negative features in the same object’ (Frenkel-Brunswik, 1949, p. 115). Following on from this rather psychoanalytical approach, Budner (1962, p. 29) described intolerance for ambiguity as ‘the tendency to perceive ambiguous situations as sources of threat’ and TA as ‘the tendency to perceive ambiguous situations as desirable’. He identified three types of ambiguous situations: novel, complex, and insoluble. Furthermore, McLain (1993, p. 184) defined TA as a ‘range, from rejection to attraction, of reactions to stimuli perceived as unfamiliar, complex, dynamically uncertain, or subject to multiple conflicting interpretations’. Based on these earlier definitions, TA is interpreted in this paper as the ability to deal with ambiguous stimuli or situations and as a central construct for the acceptance of diversity, whereby such ambiguous stimuli or situations can be characterized as being new, unfamiliar, insoluble, complex, uncertain, or open to contradictory interpretations (Budner, 1962; Frenkel-Brunswik, 1949, 1951; Herman et al., 2010; McLain, 1993, 2009).
Stability of the Construct
Frenkel-Brunswik (1949) conceptualized TA as a stable personality variable, independent of situation and context, and it continued to be treated as such (e.g., by Budner, 1962; McLain, 1993, 2009). Budner (1962) calculated a test-retest reliability of r = .85, but over intervals that were in part short, ranging from two weeks to two months. McLain (1993, 2009) did not calculate test-retest reliabilities. More generally, insufficient psychometric evidence is a problem in most studies (Herman et al., 2010). Today, TA is often assumed to be changeable and learnable (e.g., Bauer, 2011; Streitbörger et al., 2019), although this has not yet been empirically demonstrated. Little research has examined how individual TA develops. Among the few exceptions, Lind (1987) presented TA as a social aspect of learning that can be acquired during university studies through socialization in new and complex settings. A similar approach was taken by DeForge and Sobal (1989) with medical students. Furthermore, Buhr and Dugas (2006) showed a negative correlation between intolerance for ambiguity and the age of students, which may indicate that TA changes with life experience. To the best of my knowledge, there are no multivariate regression analyses or structural equation models that identify the actual determinants of TA.
Generality and Dimensionality of the Construct
TA was mostly considered an overall construct and therefore measured on a one-dimensional scale (Furnham & Marks, 2013). However, this has been questioned several times and partially contradicted by recalculations of the factor structure of the instruments of Budner (1962) and MacDonald (1970) (e.g., by Furnham, 1994; Sidanius, 1978). Researchers have addressed the question of whether TA is a general personality trait or whether it can vary across domains. On the one hand, it has been argued that items that are too general function differently in various settings (e.g., Herman et al., 2010; Reis, 1997). On the other hand, one-dimensional scales have been criticized for making it impossible to correlate TA with theoretically relevant behaviours (Kenny & Ginsberg, 1958; Lauriola et al., 2016). Herman et al. (2010) considered that contextual items that target specific content areas may prove more reliable and may reduce inconsistencies in dimensionality. Regarding dimensionality, different studies have argued for between one and eight dimensions (Furnham, 1994; Herman et al., 2010; Kenny & Ginsberg, 1958; Kirton, 1981; Lauriola et al., 2016; McLain, 1993; Norton, 1975).
Related Psychological Concepts
A concept that is very close to intolerance for ambiguity is rigidity, which is why the two concepts were treated as equivalent in early studies (Budner, 1962).1 Furthermore, both uncertainty avoidance and risk taking are concepts that are very similar to TA (Furnham & Marks, 2013).2 Another concept that is close to TA is the need for cognitive closure (NFCC) introduced by Kruglanski et al. (1993).3 NFCC characterises individual differences in the need to get a clear, definitive answer to social facts when processing social information and forming social judgements. Regarding the relationship between TA and the Big Five, there has not yet been sufficient research into whether TA is more of a facet of the Big Five or whether it is located outside them, as locus of control is (Furnham & Marks, 2013). In either case, TA correlates strongly with two of the Big Five factors: openness to experience and extraversion (e.g., Caligiuri & Tarique, 2012; Jach & Smillie, 2019; Lauriola et al., 2016; Tynan, 2020). Empirical results regarding neuroticism differ: While Caligiuri and Tarique (2012) found no relation with TA, Jach and Smillie (2019) found that neuroticism correlated negatively with the TAS by Herman et al. (2010).
Empirical Findings
In the work of Frenkel-Brunswik (1949, 1951) and Adorno et al. (1950) on ethnocentrism and authoritarianism, intolerance for ambiguity is the central component of the authoritarian personality and is positively related to racism and ethnic prejudice. However, the strong link between TA and authoritarianism could only be confirmed by a few researchers, e.g., by Budner (1962)4, who also demonstrated negative correlations between TA and conventionality, religious dogmatism, a positive attitude toward censorship, idealization of parents, and Machiavellianism. In addition, correlations have been found between TA and the propensity of students for more structured careers (Budner, 1962), discipline-specific selection effects in the choice of study (Lind, 1987; Tatzel, 1980) and a systematic relationship between political attitudes and students’ TA (Lind, 1987).5 In a more recent study, Gründl and Aichholzer (2020) showed that uncertainty avoidance was indirectly associated with the support for the populist radical right in Austria. Bardi et al. (2009) showed that intolerance for ambiguity is negatively correlated with general life satisfaction and positively correlated with general feelings of anxiety. Further, it was shown that TA is positively associated with various concepts of intercultural competence (Caligiuri & Tarique, 2012), creativity (Zenasni et al., 2008), and with prosocial behaviour (Vives & FeldmanHall, 2018).
Common Instruments for Measuring Tolerance for Ambiguity
The first attempts at measuring TA date back to Frenkel-Brunswik (1949, 1951) and were conducted as part of the studies on the authoritarian personality (Adorno et al., 1950). Experimental methods and many survey instruments using Likert items for structured questionnaires were developed. Common instruments are briefly presented here (for a review see Furnham & Marks, 2013).
For a long time, the most common and widely used instrument for measuring TA was Budner’s (1962) survey instrument with 16 items. This instrument differentiates according to the potential mode of reaction (submission, denial), the levels of reaction (phenomenological, operative), and the type of ambiguous situation (novelty, complexity, and insolubility). Budner (1962) assumed TA to be a stable one-dimensional personality variable. The instrument had a low internal consistency of α = .49, a good test-retest reliability of r = .85, and a good construct validity. The fact that this instrument was widely used despite lower than acceptable psychometric properties was criticized (e.g., Furnham & Marks, 2013; Herman et al., 2010).
McLain (1993) designed the 22-item Multiple Stimulus Types Ambiguity Tolerance measure (MSTAT-I), which highlighted several stimuli of TA, described as facets of a one-dimensional TA construct. Based on the MSTAT-I, McLain (2009) developed the MSTAT-II by removing items based on feedback from researchers and respondents. The MSTAT-II is composed of 13 items and has an internal consistency of α = .83. However, it is preferred over the MSTAT-I only when survey space is limited (Furnham & Marks, 2013). This instrument also suffers in part from an overgeneralized view or overly abstract formulations (McLain, 2009, p. 986).
Regarding German-language measurement instruments, the Inventory for the Measurement of Tolerance for Ambiguity (IMA) by Reis (1996, 1997) is psychometrically very sound. It distinguishes between five different facets that have been largely confirmed by factor analysis: TA toward problems that seem insoluble, toward social conflicts, toward role stereotypes, toward parental image, and openness to new experiences. However, the instrument is composed of 40 items.
The measurement instrument of Herman et al. (2010) aims to help researchers better understand intercultural phenomena and to improve on the conceptual dimensionality and psychometric evidence of previous measurement instruments. The 12-item Tolerance for Ambiguity Scale (TAS) was designed primarily for cross-cultural contexts; the authors argue that items that are too general are not appropriate for the various concepts of TA and advocate context-specific instruments. Herman et al. (2010) drew on the conceptualization and the instrument of Budner (1962) and attempted to improve the instrument by adding new items and removing other items (see the final instrument in Table A1). For evaluation, Herman et al. (2010) computed principal component analyses (PCA), internal consistency, and correlations among items. Accordingly, the scale represents an improvement on Budner’s (1962) scale in terms of factor structure and internal consistency, α(Herman et al., 2010) = .73 vs. α(Budner, 1962) = .49. Moreover, the authors emphasise that the instrument is designed in such a way that it is consistent with the ideas of McLain (1993, 2009). It is composed of four sub-dimensions: valuing diverse others, change, challenging perspectives, and unfamiliarity. However, given the low internal consistencies of the individual factors, they support a one-dimensional theoretical framework. The dimensional structure emerged through an exploratory approach combined with theoretical considerations, especially regarding the newly created items. These new items were written in terms of cross-cultural relevance and to complement Budner’s (1962) existing items. Herman et al. (2010) do not mention this specifically, but it stands out that the sub-dimension ‘valuing diverse others’ fits very well with considerations of Frenkel-Brunswik (1949, 1951), who defined TA as ‘tolerance of the diversity of people’. ‘Valuing diverse others’ is not included in other recent concepts (Furnham & Marks, 2013).
Method
Translation Procedure
For this study, the 12 items of the Tolerance for Ambiguity Scale (TAS) by Herman et al. (2010) were translated from English into German using the TRAPD approach (Translation, Review, Adjudication, Pretesting, Documentation; Harkness, 2003; Harkness et al., 2004). First, the English-language instrument was translated into German by two independent translators. Subsequently, the translations were discussed under the moderation of a ‘reviewer’ and solutions were developed for each item. Next, open questions were clarified with an experienced survey expert (an ‘adjudicator’). Then, the translated items were tested on a sample of the target population. Based on pretest results, the items were partially modified and tested for comprehensibility using cognitive interviews with probing procedures (comprehensive, general). In a last step, the entire translation process was documented in detail.6 In the final translation, care was also taken to ensure that the items were formulated in simple and easy-to-understand language.
Material
Responses can be given on a fully verbalized five-point response scale: 1 – does not apply at all, 2 – rather does not apply, 3 – neither, 4 – rather applies, and 5 – fully applies. The final translation of the items and the response scale can be found in Table A1 in the Appendix.
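As an illustration of how answers on this response scale could be turned into a scale score, the following minimal Python sketch maps the five verbal labels to the codes 1 to 5 and averages them. The function name `scale_score` and the `reverse_items` parameter are hypothetical. Scoring by the item mean is an assumption of the sketch, and which items, if any, are reverse-coded is not specified in this section, so that set is left as an input.

```python
# Verbal labels of the five-point response scale and their numeric codes.
RESPONSE_CODES = {
    "does not apply at all": 1,
    "rather does not apply": 2,
    "neither": 3,
    "rather applies": 4,
    "fully applies": 5,
}


def scale_score(responses, reverse_items=frozenset()):
    """Mean score over items; `responses` maps item number -> verbal label.

    `reverse_items` is a hypothetical set of item numbers to reverse-code
    (6 - code). It is an assumption of this sketch, not taken from the paper.
    """
    codes = []
    for item, label in responses.items():
        code = RESPONSE_CODES[label]
        if item in reverse_items:
            code = 6 - code  # 1 <-> 5, 2 <-> 4, 3 stays 3
        codes.append(code)
    return sum(codes) / len(codes)


# Made-up answers of one respondent to three items.
example = {1: "rather applies", 2: "neither", 3: "fully applies"}
print(scale_score(example))                     # mean of 4, 3, 5
print(scale_score(example, reverse_items={3}))  # item 3 recoded from 5 to 1
```

Keeping the reverse-coded items as an explicit argument avoids hard-coding an item key that would have to be taken from the instrument documentation.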
Data
The development of the survey instrument is embedded in the project ‘Social Cohesion in Times of Crisis: The Corona Pandemic and Anti-Asian Racism in Germany’. This project included an online panel study in Germany, in which the participants of the first survey were re-surveyed in two further waves (Mayer et al., 2022). The data was collected from December 2020 to May 2021 via an online access panel by respondi, with adjusted quotas for age, gender, and federal state. The target group was people aged 18 to 74 living in Germany with an oversampling of respondents who were born abroad or whose parents were born abroad. The pretest (n = 2,002)7 and the three waves (n = 1,370 for each wave) were used to develop and validate the survey instrument. The sample description for the analysis of the three waves (after cleansing)8 can be found in Table A2.
Analytical Strategy
To ensure high instrument quality, the measurement instrument and its documentation were based on the quality standards developed in 2014 by the German Data Forum (RatSWD) in the working group ‘Quality Standards’ (Rammstedt, Beierlein, et al., 2014). To assess the psychometric quality of the German-language TAS, validity and reliability are considered. To do this, the following three steps were taken:
(1) Factorial validity of the TAS was tested using confirmatory factor analysis, applying structural equation modelling (SEM) with maximum likelihood estimation, and using standardised coefficients and values.9 After a first screening of the proposed model of Herman et al. (2010) different model assumptions were tested.
- M1: Full model with 12 items and one dimension
- M2: Full model with 12 items and four sub-dimensions
- M3: Reduced model with 11 items and four sub-dimensions
- M4: Reduced merged model with 11 items and three sub-dimensions
All multidimensional models (M2–M4) were models with correlated latent variables, as the concept assumes correlations between the sub-dimensions. Model fit was evaluated using CFI, TLI, RMSEA, SRMR, AIC and BIC.10 The heuristic of Hu and Bentler (1999) is used, according to which a CFI and TLI of .95 (or higher), an RMSEA of .06 (or lower) and an SRMR of about .08 (or lower) imply good model fit. Lower values of information criteria such as AIC or BIC indicate a better model fit. In addition, LR tests were performed. In a next step, model misspecifications were examined using modification indices for model M3. The following iterative process was used: The error correlation with the highest modification index was included if it was meaningful in terms of content. However, so as not to overload the model, no more than four error correlations were accepted. The iteration process is documented in Table A3 in the Appendix.
(2) To test the construct validity, Spearman’s correlations were calculated with constructs that are theoretically or empirically related to TA:
- Big Five Inventory (BFI-10) by Rammstedt, Kemper, et al. (2014)
- Life Satisfaction analogous to the German Socioeconomic Panel (SOEP)
- Need for Cognitive Closure Scale (NCCS-5) by Rinke (2020)
- Adaptation of the short scale Authoritarianism (KSA-3) by Nießen et al. (2019)
- Translation of the Modern Racism Scale by McConahay (1986)
To test whether one-dimensional scales indeed make it difficult to correlate TA with theoretically relevant behaviours (Kenny & Ginsberg, 1958; Lauriola et al., 2016), correlations are computed for the proposed models for the full and reduced scale, as well as for the individual dimensions.
(3) To determine the reliability of the measurement instrument, the test-retest stability was computed.11 In addition, the internal reliability is tested by calculating McDonald’s omega (Baldwin, 2019; McDonald, 1999).
Results
Descriptive Statistics
Table 1 shows the mean, standard deviation, skewness, and kurtosis for each of the 12 items of the TAS. Means and standard deviations of the translated TAS, reported separately by gender, age group, educational background, and migration background, can be found in Tables A4–A7 in the Appendix.
Table 1
Item | M, Wave 1 | M, Wave 2 | M, Wave 3 | SD, Wave 1 | SD, Wave 2 | SD, Wave 3 | Skewness, Wave 1 | Skewness, Wave 2 | Skewness, Wave 3 | Kurtosis, Wave 1 | Kurtosis, Wave 2 | Kurtosis, Wave 3 |
---|---|---|---|---|---|---|---|---|---|---|---|---|
1 | 2.79 | 2.87 | 2.89 | 1.01 | 1.07 | 0.98 | -0.01 | 0.00 | -0.03 | 2.50 | 2.33 | 2.70 |
2 | 3.42 | 3.47 | 3.39 | 0.96 | 1.00 | 0.95 | -0.40 | -0.42 | -0.34 | 2.86 | 2.78 | 2.83 |
3 | 3.52 | 3.64 | 3.53 | 1.04 | 1.09 | 1.03 | -0.61 | -0.65 | -0.65 | 2.78 | 2.79 | 2.93 |
4 | 3.65 | 3.69 | 3.62 | 0.97 | 1.02 | 1.01 | -0.54 | -0.64 | -0.53 | 2.96 | 3.02 | 2.90 |
5 | 3.41 | 3.54 | 3.57 | 1.06 | 1.04 | 1.03 | -0.31 | -0.37 | -0.43 | 2.45 | 2.60 | 2.64 |
6 | 3.13 | 3.14 | 3.23 | 0.97 | 1.01 | 0.99 | -0.14 | -0.17 | -0.16 | 2.69 | 2.69 | 2.65 |
7 | 2.89 | 3.03 | 2.99 | 1.05 | 1.11 | 1.04 | -0.12 | -0.18 | -0.16 | 2.40 | 2.40 | 2.54 |
8 | 3.44 | 3.42 | 3.37 | 1.32 | 1.34 | 1.34 | -0.44 | -0.41 | -0.37 | 2.07 | 2.03 | 1.98 |
9 | 4.13 | 4.27 | 4.18 | 0.78 | 0.79 | 0.79 | -0.80 | -1.13 | -0.90 | 3.90 | 4.69 | 4.05 |
10 | 3.03 | 3.12 | 3.02 | 1.40 | 1.44 | 1.39 | -0.10 | -0.19 | -0.10 | 1.73 | 1.69 | 1.72 |
11 | 3.97 | 4.02 | 3.99 | 0.75 | 0.79 | 0.75 | -0.71 | -0.73 | -0.66 | 4.21 | 4.00 | 4.01 |
12 | 3.65 | 3.69 | 3.70 | 1.07 | 1.08 | 1.01 | -0.65 | -0.61 | -0.58 | 2.93 | 2.83 | 2.95 |
Score 1 | 3.00 | 3.00 | 2.96 | 0.51 | 0.54 | 0.51 | 0.15 | 0.06 | 0.11 | 3.42 | 3.36 | 3.30 |
Score 2 | 2.90 | 2.88 | 2.85 | 0.54 | 0.58 | 0.54 | 0.09 | 0.06 | 0.08 | 3.45 | 3.33 | 3.28 |
Note. n = 1,370; M = mean; SD = standard deviation; Score 1 = 12 items; Score 2 = 11 items.
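The kurtosis values in Table 1 cluster around 3, which suggests that they are reported as raw (non-excess) kurtosis, for which a normal distribution yields a value of 3. The following Python sketch shows the moment-based definitions under that assumption. The function name `describe` is illustrative, and the exact estimator used by the author's software is not stated in the text, so small-sample corrections may differ.

```python
import math


def describe(values):
    """Return mean, SD, skewness, and raw (non-excess) kurtosis."""
    n = len(values)
    mean = sum(values) / n
    # Central moments with divisor n (population moments).
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    sd = math.sqrt(m2 * n / (n - 1))  # sample SD with divisor n - 1
    skewness = m3 / m2 ** 1.5         # 0 for a symmetric distribution
    kurtosis = m4 / m2 ** 2           # 3 for a normal distribution
    return mean, sd, skewness, kurtosis


# Made-up answers on the five-point scale, for illustration only.
mean, sd, skew, kurt = describe([1, 2, 3, 4, 5])
print(mean, round(sd, 2), skew, round(kurt, 2))
```

Under this reading, items 9 and 11 (kurtosis of about 4) are somewhat more peaked than a normal distribution, whereas items 8 and 10 (kurtosis of about 2 or lower) are flatter.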
Factorial Validity
Figure 1 shows the structure proposed by Herman et al. (2010), with four correlated factors underlying the 12 items. Among the factor loadings, the most noticeable is the low loading of item_9 on the latent factor L3 (‘challenging perspectives’, .19), which may indicate that this item does not fit the factor.12 Among the correlations between the latent variables, the strong correlation of .85 between L2 (‘change’) and L4 (‘unfamiliarity’) is striking and could indicate that these two dimensions are not properly separable. This consideration is also plausible from a theoretical perspective. Table A8 shows the theoretical considerations that Budner (1962) had already made about the items. According to Budner (1962), item_5, item_6 and item_12 share both the same response reaction (phenomenological submission, PS) and the same underlying ambiguous situation (novelty). The consideration that these three items lie on a common factor therefore seems to make theoretical sense, too.
Figure 1
Based on these findings, and as described in the section Generality and Dimensionality of the Construct, different models are compared: the full model with 12 items and one dimension (M1); the full model just described with 12 items and four sub-dimensions (M2); a reduced model with 11 items and four sub-dimensions (M3); and a reduced merged model with 11 items and three sub-dimensions (M4). Fit statistics are summarized in Table 2. When a one-dimensional model (M1) was fitted to the data of the first wave, fit indices pointed to a poor model fit, CFI = .654, TLI = .578, RMSEA = .109, SRMR = .085, AIC = 45630.966, BIC = 45818.978. Model M2 with four sub-dimensions fits better than the one-dimensional model, CFI = .859, TLI = .806, RMSEA = .074, SRMR = .067, AIC = 45119.287, BIC = 45338.634, and an LR test of M1, which is nested within M2, confirms this improvement, χ2(6) = 523.68. Model M2 is acceptable but still falls short of a ‘good’ model fit according to Hu and Bentler (1999). Model M3 is estimated without item_9, which leads to an improvement in some model fit indices, CFI = .878, TLI = .824, RMSEA = .075, SRMR = .060, AIC = 41927.586, BIC = 42131.266. However, an LR test does not confirm a significant improvement of the model compared to M2, χ2(3) = -3185.70; as noted in Table 2, M3 is not nested in M2. When the reduced models M3 and M4 with only 11 items are compared, the LR test of M4, which is nested in M3, is significant, χ2(3) = 19.59, which argues for the four-dimensional structure of the model. However, the results support a strong relationship between the latent variables L2 (‘change’) and L4 (‘unfamiliarity’). In summary, the results support the theoretically derived four-dimensional structure. Omitting item_9 does not lead to a significant improvement of model fit. However, this item can be omitted because it does not fit the intended factorial structure of Herman et al. (2010). Items of this type in the scale of Budner (1962) were already criticised by McLain (2009, p. 976), since items that refer to ‘classifications of people such as teachers or experts’ may be ‘confounded by reference to specific situations that may evoke responses other than reactions to ambiguity’.
Table 2
Model | M1 One-dimensional model | M2 Four-dimensional model | M3 Reduced four-dimensional model | M4 Reduced three-dimensional model | M5 Advanced reduced four-dimensional model |
---|---|---|---|---|---|
χ2(df) model vs. saturated | 928.257(54) | 404.578(48) | 333.932(38) | 353.519(41) | 226.905(34) |
χ2(df) baseline vs. saturated | 2595.205(66) | 2595.205(66) | 2489.555(55) | 2489.555(55) | 2489.555(55) |
CFI | .654 | .859 | .878 | .872 | .921 |
TLI | .578 | .806 | .824 | .828 | .872 |
RMSEA | .109 | .074 | .075 | .075 | .064 |
SRMR | .085 | .067 | .060 | .062 | .051 |
AIC | 45630.966 | 45119.287 | 41927.586 | 41941.172 | 41828.558 |
BIC | 45818.978 | 45338.634 | 42131.266 | 42129.185 | 42053.129 |
LR χ2(df) | 523.68 (6), M1 nested in M2 (***) | -3185.70 (3), M3 not nested in M2 | 19.59, M4 nested in M3 (***) | | |
Note. CFI = comparative fit index; TLI = Tucker-Lewis index; RMSEA = root mean square error of approximation; SRMR = standardized root mean squared residual; AIC = Akaike information criterion; BIC = Bayesian information criterion.
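The LR statistics for the nested comparisons in Table 2 can be reproduced as differences between the model χ2 values, with the difference in degrees of freedom as df. The following Python sketch illustrates this using only the standard library. The helper names `chi2_sf` and `lr_test` are hypothetical, and the sketch is not the procedure of the statistical software used for the analyses.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variable with integer df."""
    if x <= 0:
        return 1.0
    z = x / 2.0
    if df % 2 == 0:
        a, q = 1.0, math.exp(-z)             # closed form for df = 2
    else:
        a, q = 0.5, math.erfc(math.sqrt(z))  # closed form for df = 1
    # Recurrence Q(a + 1, z) = Q(a, z) + z**a * exp(-z) / Gamma(a + 1)
    while 2 * a < df:
        q += math.exp(a * math.log(z) - z - math.lgamma(a + 1.0))
        a += 1.0
    return q


def lr_test(chi2_restricted, df_restricted, chi2_general, df_general):
    """Chi-square difference test; the restricted model must be nested."""
    delta_chi2 = chi2_restricted - chi2_general
    delta_df = df_restricted - df_general
    return delta_chi2, delta_df, chi2_sf(delta_chi2, delta_df)


# Values from Table 2: M1 is nested in M2, and M4 is nested in M3.
print(lr_test(928.257, 54, 404.578, 48))  # M1 vs. M2
print(lr_test(353.519, 41, 333.932, 38))  # M4 vs. M3
```

The comparison of M2 with M3 is deliberately left out: the two models are based on different item sets (12 vs. 11 items), so the difference test does not apply to them.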
In a next step, modification indices (MI) were considered for the reduced four-dimensional model M3 to investigate whether error terms are correlated and whether statistically significant paths should be added. After the error correlations were added (model M5), the model fit improved, and according to the criteria of Hu and Bentler (1999), the model comes close to a good fit, CFI = .921, TLI = .872, RMSEA = .064, SRMR = .051, AIC = 41828.558, BIC = 42053.129. However, the high correlation between L2 and L4 is still conspicuous, and presumably these latent factors cannot be properly separated from each other (see Figure 2).13
Figure 2
It was shown that a multidimensional model fits significantly better than a one-dimensional one, with the four-dimensional structural solution fitting significantly better than the three-dimensional one. In conclusion, even if a four-dimensional solution seems statistically to be best, it was shown empirically and theoretically that the dimensions ‘change’ and ‘unfamiliarity’ are very strongly related, and that item_9 does not fit the factor very well and can therefore be omitted. Regarding the factorial validity of the construct, it should be emphasized that none of the tested models represents the empirical covariances and variances of the items sufficiently well. The model fit proved to be rather low for all tested models, especially with respect to the CFI.
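To make the comparison with the Hu and Bentler (1999) heuristic explicit, the following Python sketch checks the fit indices reported in Table 2 against the cutoffs named in the Analytical Strategy section. The function name `meets_hu_bentler` is hypothetical, and treating the cutoffs as strict thresholds is a simplification, since the text describes the SRMR cutoff as "about .08".

```python
def meets_hu_bentler(cfi, tli, rmsea, srmr):
    """Check each index against the Hu and Bentler (1999) heuristic cutoffs."""
    return {
        "CFI": cfi >= 0.95,
        "TLI": tli >= 0.95,
        "RMSEA": rmsea <= 0.06,
        "SRMR": srmr <= 0.08,
    }


# Fit indices from Table 2, in the order (CFI, TLI, RMSEA, SRMR).
models = {
    "M1": (0.654, 0.578, 0.109, 0.085),
    "M2": (0.859, 0.806, 0.074, 0.067),
    "M3": (0.878, 0.824, 0.075, 0.060),
    "M4": (0.872, 0.828, 0.075, 0.062),
    "M5": (0.921, 0.872, 0.064, 0.051),
}

for name, indices in models.items():
    checks = meets_hu_bentler(*indices)
    met = [index for index, ok in checks.items() if ok]
    print(name, "meets cutoffs for:", met)
```

Read this way, the multidimensional models satisfy only the SRMR criterion, which is consistent with the conclusion that none of the tested models reaches a good fit.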
Construct Validity
Spearman’s correlations of this study and correlations of the reference studies can be found in Table 3.14 Correlations are calculated for the full and reduced overall measure and for the individual sub-dimensions for the proposed models.
Table 3
Variable | Wave | rS, 12 [rS, 11] | rS, L1 | rS, L2 | rS, L3 | rS, L4 | rref | Reference, n | TA-Measure |
---|---|---|---|---|---|---|---|---|---|
Extraversion | w1 | .24[.24]*** | .30*** | .10*** | .12[.12]*** | .18*** | .37*** | Caligiuri and Tarique (2012), n = 420 | Mod. of Gupta and Govindarajan (1984) |
| | | | | | | | .29*** | Jach and Smillie (2019), n = 308 | TAS by Herman et al. (2010) |
| | | | | | | | .26*** | Lauriola et al. (2016), n = 360 | Discomfort with Ambiguity |
Openness | w1 | .22[.21]*** | .12*** | .19*** | .15[.12]*** | .14*** | .29*** | Caligiuri and Tarique (2012), n = 420 | Mod. of Gupta and Govindarajan (1984) |
| | | | | | | | .40*** | Jach and Smillie (2019), n = 308 | TAS by Herman et al. (2010) |
Agreeableness | w1 | .17[.17]*** | .25*** | .06* | .09**[.08**] | .08** | -.19* | Caligiuri and Tarique (2012), n = 420 | Mod. of Gupta and Govindarajan (1984) |
| | | | | | | | .15 | Jach and Smillie (2019), n = 308 | TAS by Herman et al. (2010) |
Neuroticism | w1 | -.22[-.21]*** | -.26*** | -.10*** | -.06*[-.03] | -.16*** | .07 | Caligiuri and Tarique (2012), n = 420 | Mod. of Gupta and Govindarajan (1984) |
| | | | | | | | -.23*** | Jach and Smillie (2019), n = 308 | TAS by Herman et al. (2010) |
| | | | | | | | .45*** | Lauriola et al. (2016), n = 360 | Discomfort with Ambiguity |
Conscientiousness | w1 | -.01[-.01] | .11*** | -.06* | -.05[-.06*] | .01 | .00* | Caligiuri and Tarique (2012), n = 420 | Mod. of Gupta and Govindarajan (1984) |
| | | | | | | | .10 | Jach and Smillie (2019), n = 308 | TAS by Herman et al. (2010) |
Life satisfaction | w1 | .13[.12]*** | .22*** | .05 | .04[.01] | .04 | -.14* | Bardi et al. (2009), n = 510 | Intolerance for AT |
NFCC | w3 | -.50[-.51]*** | -.29*** | -.52*** | -.11[-.10]*** | -.41*** | .48 | Lauriola et al. (2016), n = 360 | Discomfort with Ambiguity |
Modern Racism | w1 | -.35[-.33]*** | -.17*** | -.36*** | -.22[-.17]*** | -.15*** | No ref. | | |
KSA-3 | w2 | -.45[-.44]*** | -.16*** | -.52*** | -.18[-.15]*** | -.31*** | -.32* | Budner (1962), n = 502 | Scale of Budner (1962) |
Note. L1 = valuing diverse others; L2 = change; L3 = challenging perspectives; L4 = unfamiliarity.
*p < .05. **p < .01. ***p < .001.
The Big Five personality traits were measured using a validated instrument by Rammstedt, Kemper, et al. (2014). In line with the studies mentioned in the section Related Psychological Concepts, the translated TAS correlates positively with ‘extraversion’ (rs, 12 items = .24) and ‘openness to experience’ (rs, 12 items = .22). But contrary to Caligiuri and Tarique (2012), respondents who scored high in ‘agreeableness’ had higher degrees of TA. ‘Conscientiousness’ was uncorrelated with the TAS, which is analogous to Caligiuri and Tarique (2012), and ‘neuroticism’ correlated negatively, which is analogous to Jach and Smillie (2019). The translated TAS correlates positively with ‘life satisfaction’, which is in line with the results of Bardi et al. (2009). In the online panel study, ‘need for cognitive closure’ was assessed using a validated short scale by Rinke (2020), the NCC-5. The translated TAS correlates negatively with the NCC-5 (rs, 12 items = -.50), which is in line with empirical and theoretical assumptions, too. As mentioned before, within the work of Adorno et al. (1950) and Frenkel-Brunswik (1949, 1951) on ethnocentrism and authoritarianism, intolerance for ambiguity is the central component of the authoritarian personality and is positively related to ‘racism’ and ‘ethnic prejudice’. As a measure of authoritarianism, a shortened version of the KSA-3 was integrated into the questionnaire. The correlation between the KSA-3 and the German TAS is strong with rs, 12 items = -.45, which is again in line with empirical considerations. ‘Modern racism’ was measured via a translation of the Modern Racism Scale of McConahay (1986). The correlation between the translated Modern Racism Scale and the German TAS is rs, 12 items = -.35.
Regarding construct validity, all findings are consistent with theoretical and empirical considerations. However, the comparability of correlations between this study and the reference studies is limited: First, the results of the reference studies refer in part to other measurement instruments for TA, and second, the reference studies have much smaller, mostly specific samples (students, managers, etc.). Across the individual sub-dimensions, the correlations do in fact differ considerably in places.
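For readers who want to retrace how a Spearman correlation such as those in Table 3 is obtained, the following Python sketch computes it as the Pearson correlation of average ranks, which handles the many ties that five-point response scales produce. The function names and the example scores are illustrative and are not taken from the study data.

```python
def average_ranks(values):
    """Rank values from 1 to n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the current run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = average
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson product-moment correlation of two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman(x, y):
    """Spearman's rank correlation: Pearson correlation of average ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Made-up scale scores for five respondents, for illustration only.
ta_scores = [2.5, 3.0, 3.0, 3.8, 4.2]
closure_scores = [4.0, 3.5, 3.9, 2.0, 2.5]
print(round(spearman(ta_scores, closure_scores), 2))
```

Because only ranks enter the computation, the coefficient is unaffected by the skewness of individual items reported in Table 1.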
Reliability
Both the full scale (12 items) and the reduced scale (11 items) show acceptable test-retest reliabilities (> .7) (see Table A9). McDonald’s omega coefficients were calculated for the overall measure and for the individual dimensions of the proposed models (see Table 4). Internal reliability of the overall measure was acceptable at ωTAS, 12 items = ωTAS, 11 items = .71. Regarding the individual dimensions, the moderately sized factor loadings of the confirmatory factor analysis are also reflected in moderate reliabilities of the sub-dimensions.
Table 4
Model | M2 Four-dimensional model | M3 Reduced four-dimensional model | M4 Reduced three-dimensional model |
---|---|---|---|
ωL1 | .61 | .61 | .61 |
ωL2 | .66 | .66 | |
ωL3 | .63 | .70 | .70 |
ωL4 | .52 | .52 | |
ωL2+L4 | | | .70 |
ω TAS, 12 items | .71 | .71 | .71 |
ω TAS, 11 items | .71 | .71 | .71 |
Note. M2 = Full model with 12 items and four sub-dimensions; M3 = Reduced model with 11 items and four sub-dimensions; M4 = Reduced merged model with 11 items and three sub-dimensions.
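As a rough illustration of how an omega coefficient relates to factor loadings, the sketch below applies the common formula for a single-factor (congeneric) model with standardised loadings and uncorrelated residuals, ω = (Σλ)² / ((Σλ)² + Σ(1 − λ²)). This is an assumption for illustration: the text does not specify the exact estimator or software used, and the loadings below are hypothetical rather than the study's estimates.

```python
def mcdonald_omega(loadings):
    """Omega for one factor from standardised loadings.

    Assumes a congeneric single-factor model with uncorrelated residuals,
    so each item's residual variance is 1 - loading**2.
    """
    if not loadings:
        raise ValueError("at least one loading is required")
    common = sum(loadings) ** 2                    # variance due to the factor
    residual = sum(1 - l ** 2 for l in loadings)   # summed residual variances
    return common / (common + residual)


# Hypothetical standardised loadings of three items on one sub-dimension.
hypothetical_loadings = [0.6, 0.5, 0.7]

omega = mcdonald_omega(hypothetical_loadings)
print(round(omega, 2))
```

Under these assumptions, moderate loadings on a short sub-scale yield only moderate omega values, which matches the pattern described for the sub-dimensions.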
Discussion
The purpose of this research was to develop a German-language instrument for measuring TA that can be used in population surveys in Germany. This development is based on the English Tolerance for Ambiguity Scale (TAS), which comprises 12 items and measures four sub-dimensions of TA: valuing diverse others, change, challenging perspectives, and unfamiliarity. The items were translated into German using the TRAPD approach. To evaluate the psychometric quality of the scale, validity and reliability were considered. Regarding the factorial validity or dimensionality of the construct, confirmatory factor analysis favours a multidimensional over a one-dimensional model, and a four-dimensional solution appears to fit significantly better than a three-dimensional one. However, it was shown that the dimensions ‘change’ and ‘unfamiliarity’ are strongly related. In addition, it was shown that one item, item_9, does not fit the model very well, although omitting this item does not result in a statistically significant improvement of model fit. Nevertheless, item_9 can still be omitted since it does not fit the intended dimensional structure. In terms of reliability, the full scale and reduced scale scores show acceptable internal reliabilities. However, the moderately sized factor loadings of the confirmatory factor analysis are also reflected in moderate reliabilities of the sub-dimensions. Likewise, the full and reduced scale scores show acceptable values in terms of test-retest reliability. Regarding construct validity, the German 11- or 12-item TAS was found to correlate positively with ‘extraversion’, ‘openness to experience’, and ‘life satisfaction’, and negatively with the ‘need for cognitive closure’, ‘modern racism’, and ‘authoritarianism’. All these findings are consistent with theoretical and empirical considerations.
To test whether one-dimensional scales indeed make it difficult to correlate TA with theoretically relevant behaviours (Kenny & Ginsberg, 1958; Lauriola et al., 2016), correlations were computed for the proposed models, both for the full and reduced total measures and for the individual dimensions. In fact, the correlations differ considerably between the individual sub-dimensions.
There are some limitations in this study. Regarding factorial validity, none of the models tested was able to reproduce the empirical variances and covariances of the items sufficiently well; model fit proved to be rather low for all tested models. In addition, the use of PCA as a robustness check, analogous to the approach chosen by Herman et al. (2010), must be viewed critically, especially because PCA is not an analytical procedure for uncovering latent dimensions. Regarding construct validity, another limitation of this study is the comparability of correlations between this study and the reference studies. First, the results of the reference studies refer partly to other measurement instruments for TA, and second, the reference studies have much smaller, mostly specific samples (e.g., students, managers). Therefore, these comparisons should be made with caution. Correlations were also calculated for the individual dimensions. It can be assumed that the individual sub-dimensions correlate differently with the other constructs, but no studies have investigated this that could serve as references. Furthermore, since the questionnaire was only administered in German, a direct comparison of the English and German TAS is not possible within this analysis.
A particularly positive aspect of this study, however, must be emphasised: The instrument could be evaluated within the framework of a real population survey. Further, since the same instrument could be used repeatedly in three waves, unbiased test-retest stability could be calculated. However, the time intervals were quite short, so this cannot be seen as proof of the stability of the construct. Future research using analytic approaches such as latent trait analysis (LTA) may be helpful in attempting to separate trait and state effects of TA.
The article makes at least two main contributions. First, it proposes a German-language version of the TAS, and second, it tests the scale using a confirmatory approach to factor analysis rather than the exploratory approaches used in previous studies. In summary, it was shown that the German TAS is a valid and reliable instrument for measuring TA in population surveys in Germany.