Bourdieu’s theory of cultural reproduction is one of the most influential sociological theories in the recent past. However, it was formulated at a time when vinyl records were played not merely for reasons of distinction and books were available only as print editions. As digitization progresses, the question arises: What are we missing by not asking about digital objects when using the well-established indicator of number of books at home?
The main interest of Bourdieu’s theory of cultural reproduction lies in explaining the process of intergenerational class (im)mobility. Bourdieu introduces three forms of capital, whose socially unequal distribution can be used to explain social reproduction: Economic, social, and cultural capital (Bourdieu, 1973, 1984, 1986; Bourdieu & Passeron, 1990). According to Bourdieu, the cultural capital of households is particularly important for the transmission of educational privileges (see also Engzell, 2021; Paino & Renzulli, 2013). He assumes that higher social classes have more cultural capital, which they transmit to their children during primary socialization, thus producing advantageous starting conditions in education (Bourdieu, 1986, p. 244). How exactly this transmission comes about remains somewhat unclear (Jæger, 2022, pp. 124–125). The aim of this article is to contribute to the validity debate of the indicator “books at home” as a measurement of the cultural capital of a household.
The Concept of Objectified Cultural Capital
Validity refers fundamentally to the extent to which statements can be regarded as valid or true in the light of empirical evidence (Shadish et al., 2002, p. 34). A measurement can be considered valid if a concept is captured in its entirety, is clearly delimited from adjacent concepts and if the measurement is as unbiased as possible.
Following Bourdieu, cultural capital is a multidimensional concept. He distinguishes between three forms of cultural capital: institutionalized, incorporated, and objectified cultural capital (Bourdieu, 1986, pp. 243–247). The term institutionalized cultural capital refers to publicly recognized and institutionalized educational certificates. These certificates are convertible into professional positions on the labor market and, thus, different incomes (economic capital). Incorporated cultural capital refers to embodied skills and knowledge acquired during the processes of family and school socialization. Finally, objectified cultural capital comprises material goods to which a cultural value is attributed, such as artworks or books.
Although Bourdieu’s concept of cultural capital is very popular in research, it is not without criticism. It has been pointed out that the definition of legitimate cultural capital is highly context-specific (Lamont & Lareau, 1988) and that the mechanisms of intergenerational transmission remain unclear (Jæger, 2022, p. 122; van Hek & Kraaykamp, 2015; Lareau, 2003).
At the measurement level, there are numerous indicators attempting to capture the various forms of cultural capital. Typically, educational certificates or the number of years of schooling are used to measure institutionalized cultural capital (e.g., Rössel & Bromberger, 2009). Incorporated cultural capital is often measured via cultural practices assumed to be associated with the acquisition of cultural knowledge, such as attendance of cultural events outside the home or practices within the home, such as reading (e.g., Börjesson et al., 2016; Kraaykamp & van Eijck, 2010). Objectified cultural capital is captured through the measurement of cultural possessions. While the number of books at home is frequently used to measure objectified cultural capital (Sieben & Lechner, 2019, p. 2), some research captures cultural possessions more comprehensively (e.g., antique furniture and music instruments; Kraaykamp & van Eijck, 2010).
The number of books at home, however, is one of the most commonly used indicators, if not the most commonly used, in research on educational inequality (e.g., Hanushek & Woessmann, 2011, p. 117) and has been shown to correlate positively with other indicators of cultural capital (e.g., Sikora et al., 2019; Evans et al., 2010; Goßmann, 2018). The indicator’s prominence can be partly explained by its parsimonious measurement, as it requires only one questionnaire item. Another factor contributing to the indicator’s pervasiveness is its broad conceptual meaning. It serves as a proxy for the total volume of cultural capital in a household as well as for the conditions of cultural socialization of children living in a household. It is further a proxy for domain-specific cultural capital in the literary field. This broad meaning of the indicator is a strength, as it can be considered a catch-all solution for the aforementioned concepts. But, at the same time, it implies a conceptual weakness (Jæger, 2022, p. 122; van Hek & Kraaykamp, 2015; Lareau, 2003) and raises the question of what the number of books in a household is a proxy for (Engzell, 2021). In a paper published in Measurement Instruments for the Social Sciences, Sieben and Lechner (2019) assess the validity of the indicator “number of books in the household”. They conclude that it is valid for capturing objectified cultural capital but should not be considered an indicator of cultural capital in general, as they find only small correlations of the indicator with cultural and literary activities.
With the digital transformation of society, the question arises as to how valid the traditional indicator of the number of printed books at home is in the context of the mass distribution of e-books. Between 2011 and 2021 the proportion of Americans who say that they have read an e-book in the past 12 months rose from 17% to 30% (Pew Research Center, 2022). A similar trend can be observed in Germany, where the share of e-book sales in total book sales rose from 0.5% in 2010 to 6.0% in 2022. However, it should be noted that e-book sales have stagnated since 2020 and the number of e-book buyers even decreased between 2013 (3.4 million) and 2022 (3.0 million). It is mainly younger people who read e-books (Börsenverein des Deutschen Buchhandels, 2023). This last finding is in line with the literature on the diffusion of innovations (Rogers, 2003) and the digital divide (van Dijk, 2005). Although e-books are sold more cheaply than printed books or are even available for free, this literature suggests that particularly younger and more educated individuals make use of digital technologies. If some social groups digitize their reading habits while others do not or only to a lesser extent, a validity problem arises for the traditional indicator of the number of books at home: We then underestimate the objectified cultural capital of some groups. Accordingly, we address the question of how valid the traditional indicator “number of books at home” is in times of digitization and investigate whether an indicator that combines digital and printed collections yields substantial advantages over the traditional indicator.
We do so by distinguishing between convergent and discriminant validity (Campbell & Fiske, 1959). A measure has convergent validity if it correlates highly with other measures of the same construct. In our case, this implies positive correlations with other literary practices. A measure has discriminant validity if measures of the same construct correlate more strongly with each other than with measures of a different construct. Books contain written knowledge and make it possible to pass on knowledge of all kinds. Hence, we can speak of discriminant validity if correlations with practices in other cultural fields are weaker than with practices in the literary field.
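The logic of this comparison can be illustrated with a minimal sketch. It assumes Pearson correlations (the type of coefficient is not specified in this section), and the toy values and variable names (`books_owned`, `books_read`, `music_hours`) are hypothetical illustrations, not survey data.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation of two equally long numeric sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)

def validity_pattern(indicator, same_construct, other_construct):
    """Compare the indicator's correlation with a measure of the same
    construct (convergent) and of a different construct (discriminant)."""
    r_conv = pearson(indicator, same_construct)
    r_disc = pearson(indicator, other_construct)
    return {
        "convergent_r": r_conv,
        "discriminant_r": r_disc,
        # Expected pattern: positive convergent correlation that exceeds
        # the correlation with the other cultural field.
        "pattern_holds": r_conv > 0 and r_conv > r_disc,
    }

# Hypothetical toy data for eight respondents (NOT taken from the study).
books_owned = [5, 18, 63, 150, 350, 750, 63, 18]
books_read = [0, 2, 5, 10, 20, 30, 6, 1]    # literary practice
music_hours = [10, 3, 8, 2, 9, 5, 1, 7]     # practice in another field

result = validity_pattern(books_owned, books_read, music_hours)
print(result)
```

In this toy example the indicator correlates strongly with the literary practice and hardly at all with the musical practice, which is the pattern that would be read as support for both forms of validity.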
Method
Data
We use data of the first wave of a panel study on cultural participation of the general population in Germany conducted in 2018 (“Cultural Education and Cultural Participation in Germany”), available at the GESIS data archive (Otte et al., 2024). The dataset contains rich information on leisure activities, cultural practices, cultural socialization, and comprehensive socio-demographic information. It includes extensive indicators that can be used to examine the question of convergent and discriminant validity, which makes the data set more suitable for the specific question than the use of other available data (e.g., PIAAC, SOEP, FReDA). The target population were all individuals living in private households who were aged fifteen years or older and had sufficient German language skills to take the interview. A two-stage random sampling procedure was applied: In the first step, a geographically stratified random sample of 183 municipalities was drawn. In the second stage, random samples of inhabitants aged 15 years or older were drawn from these municipalities’ registration offices. This procedure is most common among scientific surveys in Germany that intend to collect high-quality data in a face-to-face (CAPI) mode. Overall, 2,592 respondents were successfully interviewed. The response rate was 22.9%. More detailed information on sampling and data collection is given by Prussog-Wagner and Sandbrink (2019, p. 25).
Variables
Before presenting the results, we describe the indicators used in the analysis. First, we discuss our measurement of physical and digital book collections and explain the construction of a common indicator. As variables for evaluating convergent validity, we use items that represent literary practices, institutionalized cultural capital, and cultural socialization. These should be highly correlated with the book collections, especially the practices in the literary field. To evaluate discriminant validity, we draw on indicators that capture cultural activity in a field other than literature: the field of music.
Physical, Digital, and Combined Book Collections
To measure the current physical book collection, respondents were asked (translated from German): “Give us your guess: How many books do you own? Consider all types of books, except e-books.” The response options correspond to those used in the German National Educational Panel Study (NEPS) as well as PIAAC: (1) “None or very few (0–10 books)”; (2) “Enough to fill one shelving board (11–25 books)”; (3) “Enough to fill several shelving boards (26–100 books)”; (4) “Enough to fill one small shelf (101–200 books)”; (5) “Enough to fill a large shelf (201–500 books)”; (6) “Enough to fill a wall of shelves (>500 books)”. Respondents were provided with a show card with all options including a pictorial representation of bookshelves of different sizes to help them estimate the size of their collection.1
The respondents were then asked to specify the size of their digital book collection: “Approximately how many e-books, i.e., digital books saved as files, do you currently own?” This question was posed openly without an additional show card and the respondents’ information was stored in the form of a count variable.2
From both statements, we created a variable that represents the total size of the book collection (hereafter “combined book collection”). Because of the different response formats, we assigned each category of the physical book collection its respective midpoint.3 The values of both collections were then added.
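The construction described above might be sketched as follows. The midpoints of the five closed categories follow from the ranges in the question wording; the value used for the open-ended category (“>500 books”) is not given in this section, so `OPEN_ENDED_VALUE = 750` is an assumption made purely for illustration. The function names are likewise hypothetical.

```python
# Midpoints of the closed response categories (from the stated ranges).
CATEGORY_MIDPOINTS = {
    1: 5.0,     # 0-10 books
    2: 18.0,    # 11-25 books
    3: 63.0,    # 26-100 books
    4: 150.5,   # 101-200 books
    5: 350.5,   # 201-500 books
}

# ASSUMPTION: the open-ended category (>500 books) has no midpoint;
# 750 is an illustrative placeholder, not a value stated in the text.
OPEN_ENDED_VALUE = 750.0

def physical_midpoint(category):
    """Map a response category (1-6) to a book count; None stays missing."""
    if category is None:
        return None
    if category == 6:
        return OPEN_ENDED_VALUE
    return CATEGORY_MIDPOINTS[category]

def combined_collection(category, ebook_count):
    """Sum of the physical midpoint and the open e-book count.

    Returns None if either component is missing.
    """
    physical = physical_midpoint(category)
    if physical is None or ebook_count is None:
        return None
    return physical + ebook_count

# A respondent with "several shelving boards" and 12 e-books.
print(combined_collection(3, 12))  # 63.0 + 12
```

Treating a missing component as a missing combined value is one possible design choice; the handling of missing values is not described in this section.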
Domestic and Non-Domestic Literary Practices
Regarding practices in the literary field, we use data on the number of books read, the number of public readings attended, and the number of visits to libraries (reference period last 12 months respectively). In addition, we consider (amateur) self-production: Respondents were asked how often they wrote literary texts such as poems or short stories (“daily”/“at least once per week”/“at least once per month”/“at least once per year”/“less often, but regularly for a while in the past”/“never”).
Cultural Socialization
In the case of cultural socialization, we also use indicators in the literary field, namely the physical book collection in the parental home when the respondent was 12 years old. The same response scale was used as for the current book collection. Likewise, we consider an active form of reading socialization in childhood: We form a mean index of four items that include (1) the frequency of talking about books in the family, (2) how often the respondents were read to by their parents, (3) how often they received books as gifts, and (4) how often specific books were recommended by their parents.
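A mean index of this kind could be computed as in the following sketch. The coding of the four items (here assumed to be numeric frequency scores) and the rule for missing answers are not described in this section; the `min_valid` parameter is a hypothetical illustration of one common option.

```python
def mean_index(items, min_valid=None):
    """Arithmetic mean of item scores, ignoring missing values (None).

    ASSUMPTION: by default all items must be answered; `min_valid`
    relaxes this to a minimum number of valid answers.
    """
    valid = [v for v in items if v is not None]
    required = len(items) if min_valid is None else min_valid
    if not valid or len(valid) < required:
        return None
    return sum(valid) / len(valid)

# Hypothetical scores for: talking about books, being read to,
# receiving books as gifts, receiving book recommendations.
print(mean_index([1, 2, 3, 4]))
```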
Institutionalized Cultural Capital
Institutionalized cultural capital is, first, represented by the highest educational attainment of both parents using 6 categories.4 Second, it is measured by the respondent’s own degree. Here, 9 categories are distinguished, which differentiate tertiary degrees in more detail compared to parental education.5
Domestic and Non-Domestic Musical Practices
The extent to which cultural capital, measured by possession of book collections, is related to practices in other cultural domains is explored using musical activities. For receptive activities, we look at the frequency of listening to music using different music media (CD/LP/MC/radio/streaming/mp3/TV) in the last four weeks and the frequency of attending concerts, distinguishing between concerts of classical music/opera and popular music concerts (last 12 months). Similar to amateur writing, the frequency of singing or playing an instrument is considered, as well as the total time in years that the respondent was guided by others in making music.
Results
We first present descriptive findings on the distribution of book collections in physical and digital form (Table 1, Figure 1) and the correlation of a combined indicator with physical and digital collections (Figure 2). Correlations of physical, digital, and combined book collection with practices in the literary field (Figure 3), institutionalized cultural capital, and cultural socialization (convergent validity, Figure 4) as well as practices in another cultural domain (discriminant validity, Figure 5) are then presented. In each case, the correlations are reported for the full sample and for three age groups. We distinguish between three age cohorts, as age is a key predictor of participation in digital information and communication technologies (ICT) in the literature on the digital divide (e.g., Sawert & Tuppat, 2020). While the youngest group, 15 to 35-year-olds, have grown up with ICT as digital natives, this does not apply to the oldest group, over 55-year-olds. These are often referred to in the literature as digital immigrants, as they have to actively adapt to the new technologies (e.g., Ballano et al., 2014).
Table 1
Physical, Digital, and Combined Book Collections
| Variable | Age 15–35 years | Age 36–55 years | Age ≥ 56 years | Total sample |
|---|---|---|---|---|
| Physical book collectionᵃ | | | | |
| 0–10 books | 20.7% | 9.1% | 7.7% | 11.8% |
| 11–25 books | 22.0% | 8.8% | 10.2% | 13.0% |
| 26–100 books | 29.4% | 27.0% | 21.6% | 25.6% |
| 101–200 books | 16.3% | 21.9% | 19.5% | 19.4% |
| 201–500 books | 9.1% | 21.6% | 23.0% | 18.7% |
| > 500 books | 2.5% | 11.6% | 18.0% | 11.5% |
| Physical book collectionᵇ (auxiliary variable using midpoints) | 98.7 (143.3) | 214.5 (227.5) | 260.6 (258.7) | 200.1 (230.7) |
| Digital book collectionᵃ | | | | |
| 0 e-books | 69.0% | 66.4% | 79.7% | 72.2% |
| ≥ 1 e-book | 31.0% | 33.6% | 20.3% | 27.8% |
| Digital book collection including “0 books”ᵇ | 16.6 (89.7) | 41.6 (383.9) | 15.7 (109.5) | 24.7 (238.4) |
| Combined book collectionᵇ | 115.2 (178.8) | 256.0 (447.06) | 276.3 (285.8) | 224.7 (335.4) |
| N | 714 | 871 | 988 | 2573 |
Note. Absolute total missing values: 19.
ᵃ Categorical variable; shown are column percentages. ᵇ Count variable; shown are arithmetic means and standard deviations in parentheses.
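As a rough plausibility check, the means of the auxiliary midpoint variable in Table 1 can be approximated from the column percentages. The sketch below assumes category midpoints derived from the stated ranges and, as an additional assumption not confirmed in this section, a value of 750 for the open-ended category. Under these assumptions the grouped means come out within about half a book of the reported means.

```python
# Assumed scores per category; 750 for ">500 books" is an ASSUMPTION.
VALUES = [5.0, 18.0, 63.0, 150.5, 350.5, 750.0]

# Column percentages of the physical book collection from Table 1.
SHARES = {
    "15-35": [20.7, 22.0, 29.4, 16.3, 9.1, 2.5],
    "36-55": [9.1, 8.8, 27.0, 21.9, 21.6, 11.6],
    "56+":   [7.7, 10.2, 21.6, 19.5, 23.0, 18.0],
    "total": [11.8, 13.0, 25.6, 19.4, 18.7, 11.5],
}

# Reported means of the auxiliary midpoint variable from Table 1.
REPORTED_MEANS = {"15-35": 98.7, "36-55": 214.5, "56+": 260.6, "total": 200.1}

def grouped_mean(shares_pct, values):
    """Weighted mean of category scores, weighted by percentage shares."""
    return sum(s * v for s, v in zip(shares_pct, values)) / sum(shares_pct)

for group, shares in SHARES.items():
    approx = grouped_mean(shares, VALUES)
    diff = approx - REPORTED_MEANS[group]
    print(f"{group}: approx. {approx:.1f}, reported {REPORTED_MEANS[group]}, diff {diff:+.2f}")
```

Small deviations are to be expected because the percentages in the table are rounded to one decimal place.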
Figure 1
Comparison of Digital and Physical Book Collections
Figure 2
Correlations of Physical, Digital, and Combined Book Collections
Figure 3
Correlations With Literary Practices
Figure 4
Correlations With Institutionalized Cultural Capital and Cultural Socialization
Figure 5
Correlations With Musical Practices
As digital transformation is a dynamic phenomenon, we conducted a re-analysis based on the refreshment sample from the second wave from 2021 to validate our findings. For several reasons, we only used the data from 2021 for the re-analysis and not for the main analysis. One reason was that activities outside the home were greatly restricted due to the coronavirus pandemic. Another is that some variables were no longer collected in 2021. Finally, the sample size of the 2021 wave is substantially smaller than in 2018. The re-analysis with the more recent data, which is documented in Appendix B (see Balzer et al., 2025), does not come to any different conclusions.
Descriptive Results
Table 1 shows the inventory of physical and digital book collections in the households. For the traditional indicator of physical collections, it shows that the mode is 26–100 books, and overall, only 11.5% of households have more than 500 books at home. In addition, there is an age effect: the book collection becomes larger with advancing age. For digital book collections, we first consider whether households own at least one e-book at all. This is the case for a total of 27.8% of respondents. While the proportion is not very different between 15–35-year-olds and 36–55-year-olds (31.0% and 33.6% respectively), it is considerably lower among respondents aged 56 and older (20.3%).
When examining the original count variable, the youngest and oldest respondents are about even with 16.6 and 15.7 digital books, respectively. By a wide margin, the middle cohort has the largest collections of digital books with an average of 41.6 e-books.
The size of the combined collection is largely determined by the number of physical books. The mean values increase monotonically with increasing age, with the greatest dispersion being found among people in the middle age category, as is also the case with the digital collection.
Figure 1 shows that the physical collection is larger than the digital collection in all age groups. Not differentiated by age, this applies to 84.1% of households. Differentiating between age groups shows that the proportion of those who have a larger physical than digital collection is lowest in the youngest cohort (73.3%) and highest in the oldest cohort (90.4%). For 23.3 percent of the youngest cohort, both collections are of the same size. This is considerably less common in the two older cohorts. Owning a larger digital than physical collection is an exception in all age groups. Accordingly, although there is a slight age effect overall, the structures are fundamentally the same. This age effect can potentially be explained by the findings presented in Table 1, according to which younger cohorts have collected fewer books due to their shorter lifetimes to date.
Correlational Analysis
As described in the previous section, a common indicator for the size of the entire book collection, physical and digital, was created from the data on the size of the physical and digital collections. In Figure 2, we look at the correlations of this combined indicator with the physical and digital book collections. As an additional aid for orientation, in all further presentations of results, cells indicating the respective correlation coefficients are color-coded (“heatplot”), with the color spectrum ranging from greenish-gray (no or very weak correlation) through yellow (moderate correlation) to “hot” red (very strong correlation). The most important finding is the rather weak correlation of the digital collection with the physical collection in all age groups. The combined indicator shows a very high correlation with the physical book collection, which is primarily due to the fact that most households predominantly own physical book collections, as shown in Figure 1.
Analysis of Convergent Validity
Crucial to the validity of the number of books indicator, however, is whether digital collections represent the same construct as physical collections, or rather a different construct or a different dimension of the same construct. To examine convergent validity, we look at the correlations of the three indicators with literary practices, institutionalized cultural capital, and cultural socialization in Figure 3 and Figure 4.
For literary practices, the highest correlations can be observed for the outcome reading books; all other practices correlate less strongly with physical collections. Compared to physical collections, the correlation of digital collections with reading books is considerably lower, and the combined indicator barely correlates more strongly with the outcome than the traditional indicator.
For institutionalized cultural capital and cultural socialization, a similar picture emerges. For respondents’ education, the parental book collections as well as the reading socialization, the correlations with the physical collection are around 0.3, and the digital collection only reaches half of this size. Here, the combined indicator performs slightly better and seems to capture a little more than the physical indicator alone, but again the additional gain is not substantial. We do not find any strong age effect either. The basic patterns are similar across all cohorts.
Analysis of Discriminant Validity
Finally, we consider the correlations of the three indicators with practices in the field of music to investigate discriminant validity. The correlation with listening to music is close to zero for all three variants of the book collection. Quite considerable correlations are found between attending classical concerts and the physical book collection, but not the digital book collection. The combined indicator does not contribute considerable explanatory power for any of the variables. The findings are evident for all age groups. Consideration of discriminant validity suggests that the physical book collection and the combined indicator can be understood not only as indicators of cultural capital in the literary field, but as more generalized indicators of highbrow cultural capital. The indicator for the digital book collection, on the other hand, correlates more weakly with practices in the literary field, institutionalized cultural capital, and cultural socialization, and practically not at all with musical practices.
Discussion
Using survey data of the German population from 2018, we investigated whether we are missing something when using book collections as a measure of cultural capital without taking digital collections into account. We found that digital book collections do not yet play a formative role compared to print collections. Only just under 30 percent of the population aged 15 and older owns e-books. More relevant, however, is the question of whether those people who own e-books have replaced their physical collection with them. If this were true, we would systematically underestimate the size of the collection for some groups. Our findings indicate that this does not seem to be the case. It is also important to point out that the size of a person’s e-book collection is considerably less correlated with his or her reading frequency than that of the printed book collection, even for the youngest age group. This observation casts doubt on the convergent validity of the digital indicator. When the digital collection is considered in addition to the physical collection, the empirical information gains are marginal. This finding is in line with Heppt et al. (2022), who show that the e-book collection does not have any additional explanatory power beyond the physical book collection in explaining children’s academic achievement.
We used data from a panel wave conducted in 2018, as they allow for the most comprehensive analyses of convergent and discriminant validity. As the data used in our analyses are already seven years old, we have repeated the analyses using more recent data from 2021. The results are not substantially different. Hence, we are optimistic that our main conclusions are meaningful. Nevertheless, it must be taken into account that the digital transformation is a dynamic process, and it is possible that a similar study will come to different results in five or ten years. This temporal restriction is supplemented by a spatial reservation. Compared to the USA, for example, e-books are less widespread in Germany. The question is to what extent the findings also apply to countries in which the digital transformation of society is more advanced. Although the literature on the digital divide and the diffusion of innovations suggests that social gradients should become weaker as mass adoption increases, whether this is confirmed empirically is still an open question.
Although our study is only a temporal and spatial snapshot, we currently see no problem in only taking the number of physical books into account. However, we have mentioned literature arguing that the indicator is conceptually imprecise. According to our findings on convergent validity, it can be recommended as a rough indicator to record socio-cultural background. However, if one is interested in causal studies that investigate specific mechanisms, the indicator seems less suitable.
Our practical implication is that currently e-books only need to be measured for research questions about digital behavior and are not necessary for researching general questions on intergenerational (im)mobility. At the same time, we call for the validity of this conclusion to be examined regularly as digital transformation progresses.
This is an open access article distributed under the terms of the Creative Commons Attribution License (