A Psychometric Evaluation of the Big Five Inventory (BFI) in an Eastern Africa Population

The Big Five factor model is one of the most frequently used models in modern personality psychology. It captures personality in terms of five broad dimensions, namely Extraversion, Agreeableness, Conscientiousness, Emotional Stability/Neuroticism, and Intellect/Openness to experience, discovered through a series of psycho-lexical studies. We translated the Big Five Inventory (BFI), a measure developed to operationalize the Big Five personality structure, into the Swahili language and evaluated the psychometric properties of both the newly developed Swahili version and the original English version in a sample of 200 university students (114 women; 86 men; average age: 20.16) in Kenya. Principal Component Analysis with varimax rotation of five factors was conducted on both raw and ipsatized scores, for both language versions of the BFI. Only two factors were fully replicated, Conscientiousness and Neuroticism. The Conscientiousness factor was fully replicated using the ipsatized scores of the English BFI, while Neuroticism was replicated with both the raw and the ipsatized scores of the English BFI. The Swahili version of the BFI failed to unambiguously replicate any of the five factors with either the raw or the ipsatized scores. Results also showed poor-to-moderate scale reliabilities for both the English and the Swahili versions of the BFI.

The Big Five factor structure, which consists of five super-ordinate dimensions of human personality, namely Extraversion (E), Agreeableness (A), Conscientiousness (C), Emotional Stability or Neuroticism (N), and Openness to Experience or Intellect (O), has its roots in the psycho-lexical approach, which takes natural language as the starting point for developing a full vocabulary of trait-descriptive terms. In this view, natural language is a reservoir of terms that were encoded in it because they proved useful for communicating about people's behavior and dispositions. This five-factor structure, first discovered by Fiske (1949), has over the years gone through a series of refinements culminating in the current model, popularly referred to as the Big Five (De Raad & Mlačić, 2015; Goldberg, 1981).
The applicability of the Big Five personality model has not been widely tested on African soil despite its proponents' claim that it applies to all humans. For example, Heaven et al. (1994) and Heaven and Pretorius (1998) tried to recover the Big Five in different cultural groups, including a group of African ethnic origin, using a set of trait items meant to represent the five factors. These two studies suggested that the Big Five were less well recoverable in the group of African ethnic origin. Heuchert et al. (2000) factored the 240 items of the NEO-PI-R using data from a multicultural sample of South African college students, and found no reason to conclude that the structure differed. Zecca et al. (2013) reported on the use of the NEO-PI-R in French-speaking African countries and Switzerland, and concluded that the five-factor model replicated fairly well in Africa. Schmitt et al. (2007) studied the five-factor model in 10 regions of the world that included 56 nations, among which Africa was represented by seven nations (Botswana, D.R. Congo, Ethiopia, Morocco, South Africa, Tanzania, Zimbabwe). They showed that the five dimensions could be well replicated in other languages and cultures, but also reported some different tendencies for African countries. The African countries showed lower congruencies with a US structure than other regions did. Moreover, scale reliabilities for the African region were weak (from .55 to .68) in comparison to other regions (from .70 to .79 across all regions).
Personality measures in Africa are scarce. This is true not only for measures developed on African soil, that is, measures proceeding from emic studies, but also for measures that were developed elsewhere and translated for use in African languages (etic). For example, in studies where Big Five measures developed in Europe or in the US were adapted for local use, and in which the global applicability of the Big Five factors was investigated, African cultures have been underrepresented (Rossier et al., 2005; Rossier & Rigozzi, 2008).
For our purposes, we used the Big Five Inventory (BFI; John et al., 1991), a measure that has been used only a few times in Africa (see Schmitt et al., 2007).

The Big Five Inventory (BFI)
The BFI (John et al., 1991) is a non-commercial 44-item instrument that takes about 10 minutes to fill out. It was constructed to measure human personality based on the Big Five model. The BFI consists of short phrases based on personality trait adjectives considered to be prototypical of the Big Five (John et al., 2008). It uses a 5-point scale where '1' stands for 'disagree strongly' and '5' stands for 'agree strongly', and it is freely available for use in research. Over the years the BFI has been translated for use in various languages. These translations and adaptations include, for example, Italian (Fossati et al., 2011; Ubbiali et al., 2013), Spanish (Benet-Martinez & John, 1998), Chinese (Carciofo et al., 2016; Leung et al., 2013), Turkish (Karaman et al., 2010), and Dutch (Denissen et al., 2008). A notable addition to this list is the version for the Tsimane (Gurven et al., 2013), a small-scale forager-horticulturalist community in central lowland Bolivia. In applied research the BFI has been used in various settings such as education (Patrick, 2011; Wagerman & Funder, 2007), language use (Lee et al., 2007), and clinical practice (Paine et al., 2009). Though most of these versions of the BFI have returned an intact factor structure and high internal consistencies of the five-factor scales, some language adaptations, particularly those from developing societies, have either failed to support the Big Five personality structure, or have had to undergo significant modifications of the scale items in order to achieve an acceptable fit and satisfactory reliability scores (see e.g., Gurven et al., 2013; Leung et al., 2013).
The BFI has been used very few times in Africa as a personality measure, and even in these few instances either the original English version or a French adaptation was used. For the seven African nations in the 56-nation study of Schmitt et al. (2007), the English version of the BFI was used in six countries, while for the D.R. Congo a French version was used. For Kenya, and more generally for Swahili-speaking countries, the BFI was not yet available.

The Swahili Language
Swahili, also referred to as Kiswahili, is the most widely spoken Bantu language in Africa. It is the mother tongue of the Swahili people, who live along the east African coast stretching from southern Somalia all the way to northern Mozambique (Appiah & Gates, 1999; Encyclopedia Britannica, 2005). Swahili belongs to the Benue-Congo branch (to which Shona and Zulu also belong) of the Niger-Congo language family. Although only about 15 million people speak Swahili as their mother tongue, Swahili is spoken as a fluent second language by more than 100 million people, most of whom are found in eastern and central Africa (Appiah & Gates, 1999; Encyclopedia Britannica, 2005). Swahili is a national and/or official language in Tanzania, Kenya, Uganda and the Democratic Republic of Congo, and also one of the official languages of the African Union.
In the current study, therefore, we aim to adapt the BFI for a Swahili-speaking community in eastern Africa. This paper reports on the translation of the Big Five Inventory into the Swahili language. It also reports on the evaluation of the psychometric properties of the Swahili version of the BFI, as well as of the original English version, since English is a common language among Kenyans.

Method
The study consisted of two phases, the translation phase (from English to Swahili) and the psychometric evaluation phase (for both the Swahili and English versions). The adaptation of the BFI to Swahili took place in the context of a larger project in which the Dutch Big Five inventory, the FFPI (Hendriks et al., 1999), was also adapted to the Swahili context. That project also included a Swahili-language taxonomic study, following the psycho-lexical procedure, in order to arrive at an indigenous Swahili trait structure that is optimal for the Swahili context and to support the development of an indigenous personality inventory.

The Translation Phase
In order to achieve construct validity, the translation followed a multi-step process. In the first step, two bilingual individuals translated the English BFI into Swahili, aided by two dictionaries, the English-Swahili dictionary and the Swahili-English dictionary, both published by Oxford University Press. This led to the first draft of the BFI Swahili version. In the second step, this first draft was sent to two bilingual linguists in Kenya, unfamiliar with the original English version, who independently back-translated it, thus producing two English versions. The English original was also sent to one other Kenyan bilingual individual, a scholar in hospitality management, who independently translated it, resulting in a second draft of the BFI Swahili version. In the third and final step, a committee of four experts, including the first author and a linguist, was formed to scrutinize the two draft Swahili translations, the two back-translations of the first draft Swahili version, and the English original. This led to the production, and adaptation, of the final Swahili version (BFI_swa). Both the Swahili and English items of the BFI are presented in Table A in the Appendix.

The Data Collection and Analysis Phase
Participants
The sample (N = 200) consisted of 199 undergraduate students at Pwani University in Kilifi, Kenya, and one high school student. The participating students were selected by convenience sampling, cross-sectionally, from the first year to the fourth year of study, and from two schools of the University, namely the school of humanities and social sciences and the school of health sciences. Of the 200 participants, 114 were female (57%) and 86 were male (43%). The mean age of the participants was 20.16 years (SD = 1.44; range = 16-25 years). All participants gave verbal consent before taking part in the study, and none was paid for taking part. Participation was purely voluntary and was not related to earning course credits. Ethics clearance was obtained from the Pwani University Ethics Committee.

Measures & Procedure
All participants had to fill out both the Swahili version of the BFI (BFI_swa) and the original English version of the BFI (BFI_eng). Each BFI version was paired with the FFPI version in the opposite language, which was expected to minimize memory effects from responding to similar items in consecutive questionnaires. A participant would fill out one questionnaire first and, when ready, hand it over to the researcher and be issued with the second questionnaire. Each participant was requested to generate a unique number (identifier) to be used to match the two completed questionnaires. The identifier had to take the form: instrument/language/3initials/year of birth/month of birth/date of birth. The entire process, from explanation to filling out both questionnaires, took about one hour.

Data Analysis
In order to test the replicability of the Big-Five dimensions as envisaged in the BFI, the researchers first conducted a Confirmatory Factor Analysis (CFA) on the data from both language versions of the BFI.
Reliabilities of the scales (for both language versions) were investigated using Cronbach's alpha. Cross-language convergent and discriminant validity, based on the a priori internal structure of the BFI, was examined using exploratory factor analysis and cross-language correlations. In order to control for possible response style bias, the data were also corrected for acquiescence response tendencies using the ipsatizing procedures outlined by Soto et al. (2008). Exploratory factor analysis was conducted using Principal Component Analysis (PCA) with varimax rotation, on both the raw scores and the ipsatized scores, and extraction was limited to 5 principal components. Analyses were conducted using SPSS version 25 and LISREL version 8.80.
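To illustrate what ipsatizing does to a response record, the following minimal sketch applies a generic within-person standardization: each respondent's item responses are centered on that respondent's own mean and divided by that respondent's own standard deviation. This is an illustration under stated assumptions, not the exact procedure of Soto et al. (2008), which estimates acquiescence from a specific set of item pairs with opposite content; the function name `ipsatize` and the optional `subset` argument (the indices of items used to estimate the centre and spread) are hypothetical.

```python
from statistics import mean, pstdev


def ipsatize(responses, subset=None):
    """Within-person standardization of one respondent's item responses.

    responses: list of raw item scores (e.g. 1-5) for a single respondent.
    subset:    optional list of item indices used to estimate the respondent's
               response centre and spread (hypothetical parameter; by default
               all items are used).
    """
    basis = responses if subset is None else [responses[i] for i in subset]
    centre = mean(basis)    # respondent's general tendency to agree
    spread = pstdev(basis)  # respondent's tendency to use extreme options
    if spread == 0:
        # Constant responder on the basis items: centering only.
        return [float(r - centre) for r in responses]
    return [(r - centre) / spread for r in responses]


# Illustrative record, not data from the study.
print(ipsatize([1, 2, 3, 4, 5]))
```

After this transformation every respondent has the same mean and spread, so differences between respondents in overall agreement no longer feed into the item correlations that the PCA analyzes.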

Inter-Correlations, Reliabilities, and Corrected Item-Total Correlations
The figures in Table 1 give the scale reliabilities for the two BFI language versions and the cross-language scale correlations. The reliabilities of the five scales were moderate, with Cronbach's alpha ranging from .49 (Neuroticism) to .73 (Conscientiousness) for the Swahili version, and from .54 (Openness) to .82 (Conscientiousness) for the English version. The cross-language scale correlations ranged from .45 (Openness) to .78 (Conscientiousness), and had a mean absolute value (after Fisher r-to-z transformation) of .62. The cross-language inter-item correlations between the English and the Swahili BFI ranged from -.64 to .62 with a mean value of .02. Table 2 displays the corrected item-total correlations for both the English and the Swahili versions of the BFI. Notably, the item 'prefers work that is routine' (an Openness item) had negative correlations in both the Swahili and English versions, probably indicating that this item fits better in another scale, possibly Conscientiousness.
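The two summary statistics used in this section can be sketched as follows. The sketch computes Cronbach's alpha from item columns and averages correlations through the Fisher r-to-z transformation. It assumes reverse-keyed items have already been recoded; the function names and the example values are illustrative and are not the study's data.

```python
import math
from statistics import variance


def cronbach_alpha(items):
    """Cronbach's alpha for one scale.

    items: list of item columns; each column holds the scores of the same
           respondents, in the same order (reverse-keyed items recoded).
    """
    k = len(items)
    totals = [sum(scores) for scores in zip(*items)]  # scale total per respondent
    item_variance_sum = sum(variance(col) for col in items)
    return k / (k - 1) * (1 - item_variance_sum / variance(totals))


def fisher_mean(correlations):
    """Average correlations via Fisher's r-to-z transformation."""
    z_values = [math.atanh(r) for r in correlations]
    return math.tanh(sum(z_values) / len(z_values))


# Hypothetical three-item scale answered by five respondents.
scale = [[1, 2, 3, 4, 5], [2, 2, 3, 5, 4], [1, 3, 3, 4, 4]]
print(round(cronbach_alpha(scale), 2))
print(round(fisher_mean([0.45, 0.78]), 2))
```

Averaging in the z metric rather than averaging the raw correlations avoids the bias that arises because the sampling distribution of r is skewed for values far from zero.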

Factor Structure
The Confirmatory Factor Analysis (CFA) of the Big Five factor model failed to return a good fit for either language version. For the English version, χ²(892) = 1989.61, RMSEA = 0.078, 90% CI [0.074, 0.083], CFI = 0.49, and for the Swahili version, χ²(892) = 1603.35, RMSEA = 0.062, 90% CI [0.057, 0.067], CFI = 0.76. This was not surprising, as Hopwood and Donnellan (2010) indicated that "personality trait inventories often perform poorly when their structure is evaluated with confirmatory factor analysis" (p. 332). We then conducted an exploratory factor analysis using Principal Component Analysis with varimax rotation, on the a priori five-factor solution, on both the raw scores and the ipsatized scores, for each of the two language versions of the BFI.
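As a sanity check on the reported fit statistics, the sketch below recomputes the model degrees of freedom and the RMSEA point estimates. It rests on assumptions not stated in the text: each of the 44 items loads on exactly one factor, factor variances are fixed to 1, the five factors are free to correlate, residuals are uncorrelated, and RMSEA is computed with the common N - 1 formula. Under these assumptions the degrees of freedom come out at 892, as reported, and the RMSEA values land close to the reported 0.078 and 0.062; small differences may reflect rounding or the exact formula used by the software.

```python
import math


def cfa_degrees_of_freedom(n_items, n_factors):
    """df for a simple-structure CFA with correlated factors (assumed model)."""
    moments = n_items * (n_items + 1) // 2            # unique variances and covariances
    loadings = n_items                                # one free loading per item
    residual_variances = n_items                      # one free residual per item
    factor_covariances = n_factors * (n_factors - 1) // 2
    return moments - (loadings + residual_variances + factor_covariances)


def rmsea(chi_square, df, n):
    """RMSEA point estimate using the N - 1 convention (assumed)."""
    return math.sqrt(max(chi_square - df, 0.0) / (df * (n - 1)))


df = cfa_degrees_of_freedom(n_items=44, n_factors=5)
print(df)
print(round(rmsea(1989.61, df, 200), 3))  # English version
print(round(rmsea(1603.35, df, 200), 3))  # Swahili version
```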

The BFI_eng
A five-factor extraction of the un-adjusted raw scores of the English version accounted for 37% of the total variance but failed to return a good replication of the Big Five factors.
As shown in Table 3, with the exception of the Neuroticism factor, which is fully replicated (albeit with some substantial secondary loadings on other factors), the resulting factor structure is distorted, with no single Big Five factor unambiguously replicated. The Conscientiousness factor has 8 of its 9 items (89%) loading onto it, though with some secondary loadings on other factors, while the Agreeableness factor has 7 of its 9 items (78%) loading onto it. The five factors extracted from the ipsatized scores accounted for 32% of the total variance, and two of the Big Five factors, Conscientiousness and Neuroticism, were fully and unambiguously replicated.

The BFI_swa
An extraction of five factors accounted for 34% and 32% of the total variance for the unadjusted raw scores and the ipsatized scores, respectively. As Table 3 shows, both the raw scores and the ipsatized scores failed to conform to the expected Big Five structure. With the exception of the Conscientiousness factor, where 8 of its 9 items (89%) load substantially onto it (for both the raw and ipsatized scores), all the other factors have fewer than 50% of their intended items loading substantially onto them.
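The extraction and rotation steps reported in this section can be sketched with NumPy as follows. This is a minimal illustration, not the SPSS routine used in the study: it takes the first five principal components of the item correlation matrix, scales them into loadings, and applies a plain varimax rotation. Statistical packages often apply Kaiser normalization before rotating, which this sketch omits, so loadings may differ slightly from packaged output. The function names and the simulated responses are assumptions for illustration only.

```python
import numpy as np


def varimax(loadings, max_iter=100, tol=1e-6):
    """Orthogonal varimax rotation of a (items x factors) loading matrix."""
    n_items, n_factors = loadings.shape
    rotation = np.eye(n_factors)
    criterion = 0.0
    for _ in range(max_iter):
        rotated = loadings @ rotation
        target = rotated**3 - rotated @ np.diag((rotated**2).sum(axis=0)) / n_items
        u, s, vt = np.linalg.svd(loadings.T @ target)
        rotation = u @ vt
        if s.sum() < criterion * (1 + tol):  # no meaningful improvement
            break
        criterion = s.sum()
    return loadings @ rotation


def pca_varimax(data, n_components=5):
    """PCA on the correlation matrix followed by varimax rotation.

    data: (respondents x items) array of raw or ipsatized scores.
    Returns the rotated loadings and the proportion of total variance
    accounted for by the retained components.
    """
    corr = np.corrcoef(data, rowvar=False)
    eigenvalues, eigenvectors = np.linalg.eigh(corr)
    top = np.argsort(eigenvalues)[::-1][:n_components]
    loadings = eigenvectors[:, top] * np.sqrt(eigenvalues[top])
    explained = eigenvalues[top].sum() / corr.shape[0]
    return varimax(loadings), explained


# Simulated 5-point responses for 200 respondents and 44 items (not study data).
rng = np.random.default_rng(0)
simulated = rng.integers(1, 6, size=(200, 44)).astype(float)
rotated_loadings, explained = pca_varimax(simulated)
print(rotated_loadings.shape, round(float(explained), 2))
```

Because varimax is an orthogonal rotation, it redistributes variance among the five components without changing the total variance they account for, which is why the percentage of variance explained can be reported independently of the rotation.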

Discussion
In this study we developed a Swahili version of the English-language Big Five Inventory (BFI), and we evaluated the psychometric properties of both the English and Swahili versions of the BFI in a student sample in Kenya. The CFA fit indices for the Big Five factor model did not indicate a good fit in either language version of the BFI for this sample. We examined the reliabilities of the two versions of the BFI, and concurrent and construct validity in terms of the cross-language correlations and factor structure.
The scores produced moderate reliability estimates based on Cronbach's alpha for both language versions. The Conscientiousness scale came out as the most reliable scale in both versions, and overall the Swahili BFI produced the weaker reliabilities.
The cross-language scale correlations were also low-to-moderate (with the exception of Conscientiousness, which was above .70). The high correlations of the Conscientiousness factor are consistent with findings in other BFI studies, such as the US English original (John et al., 2008), the Spanish version (Benet-Martinez & John, 1998), the French version (Plaissant et al., 2005), the Turkish version (Karaman et al., 2010), and the Indonesian version (Wibowo et al., 2017). The low reliabilities in this study correspond with the findings of Schmitt et al. (2007), where the African sample's reliabilities ranged from .55 to .68. The modest cross-language agreement could be an indication that items for similar scales were interpreted differently in each language, meaning that the two languages elicited different responses to similar items. This could also explain the low-to-moderate cross-language inter-item correlations. These results point to the possibility that language influences how the items are treated.
Another possible explanation could lie in the structural difference between the two languages. Whereas in English people's attributes are described using adjectives, in the Swahili language nouns are mostly used (Garrashi et al., 2022). This is evident in the Swahili version of the BFI, where adjectives appear in only two items while the rest of the items are worded with nouns. The Garrashi et al. (2022) study has shown that Swahili has a very limited number of adjectives and that descriptions of people's behaviors and attributes mostly use type and attribute nouns and, to a lesser extent, verbs.
The fact that a similar study using an African-American sample (Worrell & Cross, 2004) returned high reliabilities of .72 to .83, while two studies (the current study and Schmitt et al., 2007) that used an African sample returned low reliabilities, could also be an indication that the African population has a different conceptualization of personality traits; a possible socialization effect at play.
As in the studies by Soto et al. (2008) and Rammstedt and Farmer (2013), we also adjusted our scores for acquiescence response style tendencies for both language versions. But unlike Rammstedt and Farmer's (2013) study, which achieved full replication of all the Big Five factors using ipsatized scores, our study only managed to replicate Conscientiousness and Neuroticism using the ipsatized scores of the English version of the BFI. The Swahili version data sets (raw and ipsatized) returned distorted factor structures.
While a good replication of Conscientiousness and Neuroticism has been achieved using ipsatized scores of the English BFI, both the raw and ipsatized Swahili BFI managed only a moderate replication of the Conscientiousness factor.
This failure to replicate the Big Five factors does not really surprise us, because in our psycholexical study of the Swahili language (Garrashi et al., 2022) those five factors did not emerge. We instead extracted a six-factor structure that correlates poorly with the Big Five factor model. We attribute the recovery of the Conscientiousness factor to the fact that the participants were university students who had worked hard in their education and passed their Kenya Certificate of Primary Education examination and their Kenya Certificate of Secondary Education examination, which suggests that they are a highly disciplined and conscientious group. This may make it easy for them to self-identify with and relate to the Conscientiousness items.
We can explain the replication of Neuroticism from an Ubuntu perspective. Since Ubuntu emphasizes social relationships and caring about other people's feelings and needs, emotional stability is very important. It is important to pay attention to people's emotions so as not to hurt their feelings. Compassion and emotional well-being are key factors in Ubuntu.
It is worth mentioning that all our respondents were university students who understand both English and Swahili well, because these two languages are both national and official languages in Kenya. Therefore, poor comprehension of the questionnaire items cannot be cited as a possible reason for the poor replication of the Big Five factors in this population. While some studies have reported on the effect of education on the replication of the FFM (Rammstedt & Farmer, 2013; Rammstedt et al., 2010), where highly educated respondents returned a very clear Big Five structure, our sample of university students failed to replicate the structure with both raw and ipsatized data sets.
Unlike the Benet-Martinez and John (1998) study, which did not find sufficient evidence for substantial Latin-US cultural differences in personality structure as defined by the Big Five, the present study may reflect culture playing a role in the poor replication of the five factors. The present study also deviates slightly from Piedmont et al. (2002), who obtained a strong-to-borderline replication of the five factors from both the English and Shona versions of the NEO-PI-R. The findings of the present study resemble those of Laajaj and Macours (2018) on Measuring Skills in Developing Countries, where an analysis of 23 survey data sets (including that of Luo farmers in Kenya) failed to return the Big Five factor structure. Our study can also be compared to the Gurven et al. (2013) study among the Tsimane forager-farmers of Bolivia, where the researchers failed to find robust support for the Big Five personality structure. Therefore, like the Gurven et al. (2013) study, the present study shows that "despite the increasing consensus supporting the five-factor structure model, the model does not robustly emerge everywhere" (Gurven et al., 2013, p. 354).
Thus the failed replication of the ubiquitous five-factor personality model in the current study could point to the presence of a culture-specific way of conceptualizing personality, which the BFI is not able to capture. In our view, since the BFI is a European/American tool, it projects WEIRD (Western, Educated, Industrialized, Rich, and Democratic) conceptualizations of personality that are alien to this community and cannot be captured using an etic-driven approach. Laajaj et al. (2019) point out that "...commonly used personality questions generally fail to measure the intended personality traits and show low validity, in low- and middle-income countries" (p. 1); the results of the present study are consistent with that observation. Indigenously generated instruments are therefore recommended.

Limitation and Conclusion
The main limitation of the present study was that all the participants were of a similar age and were drawn from one public university, so it may be difficult to generalize the findings to all East Africans. Therefore, a follow-up study involving participants from throughout the eastern Africa region (Kenya, Tanzania, Uganda, Rwanda, Burundi, and the DRC), where Swahili is widely spoken, is necessary.
In the current study we looked at the reliabilities and factor structures of the English and Swahili versions of the BFI. We obtained low-to-moderate reliabilities, and cross-language correlations indicating a weak-to-moderate correspondence between the two language versions. We also conducted an exploratory factor analysis of both the raw and ipsatized data sets of the English and Swahili versions of the BFI. With the exception of the Conscientiousness factor, which was retrieved using the ipsatized data of the English version, and Neuroticism, which was retrieved using both the raw and ipsatized data of the English version, the Big Five factor model, as envisaged in the BFI, failed to emerge unambiguously for both versions of the Big Five Inventory. Therefore, a second look at the instrument, a modification of items, and an eastern Africa cross-national validation of the instrument are recommended. Another recommendation is the generation of emic-driven indigenous instruments.

Table 2
Corrected Item Total Correlations (Scales Based on the Original Structure of BFI)

Table 3
Varimax-Rotated PCA Factor Structures of Both the English and Swahili Versions of the Big Five Inventory Items Based on Raw and Ipsatized Scores

Table A
Items in Swahili and English