Structural validity of the Brazilian version of the Sense of Coherence scale (SOC-13) in oral health research: exploratory and confirmatory factor analysis

Background The Sense of Coherence (SOC) construct has been used worldwide in oral health research, but rigorous factor analyses of the scale are scarce. We aimed to test the dimensional structure of the Brazilian short version of the SOC scale with 13 items. Methods This study is a secondary analysis of four independent cross-sectional Brazilian studies on oral health, using the 13-item SOC scale. Sample 1 comprised 1760 mothers and 1771 adolescents. Sample 2 comprised 1100 adults. Sample 3 had 720 adults and older individuals. Sample 4 comprised 664 adolescent students. Confirmatory Factor Analysis (CFA) was conducted on sample 1 to compare two models: 3-factor versus 1-factor. Because both were rejected, Exploratory Factor Analysis (EFA) was implemented in samples 2 and 3. Modified models were tested in sample 4 using CFA. All analyses were conducted with MPlus version 7.11. Results CFA of sample 1 resulted in an unacceptable fit for both the 1-factor model (RMSEA = 0.12; CFI = 0.78; TLI = 0.73; and WRMR = 3.28) and the 3-factor model (RMSEA = 0.10; CFI = 0.87; TLI = 0.84; and WRMR = 2.50). The EFA on samples 2 and 3 showed two eigenvalues greater than 1 in each sample (4.11 and 1.56 in sample 2; 4.32 and 1.42 in sample 3), but the scale items soc1, soc2 and soc3 formed an uninterpretable second factor. Another CFA, using sample 4, showed acceptable model fit after removing those three items and also soc11 (RMSEA = 0.05; CFI = 0.98; TLI = 0.99; and WRMR = 0.71). Conclusion The results indicate that the SOC-13 scale needs further adjustments. The one-factor model with nine items showed a good statistical fit, but the implications of excluding items should be further investigated, considering the scale's content validity, cross-cultural adaptation and theoretical background.

adversities and maintain their health [1]. Indeed, it has been shown that the way people deal with stressful events impacts their mental health, obesity, different types of cancer, cardiovascular diseases, and health-related behaviours [2][3][4][5]. Consistently, adolescents and their mothers with higher SOC have less dental caries, and higher SOC is associated with greater probability of having preventive dental appointments and less gingival bleeding [6][7][8]. Higher SOC levels in adolescents were associated with fewer decayed teeth in crude but not adjusted models [6], but a recent systematic review showed a strong effect on dental caries [9]. Higher SOC was also associated with more frequent toothbrushing, lower soda drink consumption, fewer decayed teeth, and fewer oral health-related impacts [10].
The SOC scale proposed by Antonovsky has been widely used in versions of 13 and 29 items [1,11]. Originally developed in English, it has been translated into 49 languages and used in at least 48 countries [11,12], indicating some face and content validity. Nonetheless, its factorial structure remains unclear. On the one hand, assuming one overall factor, 124 studies have shown an acceptable Cronbach's alpha ranging from 0.70 to 0.95 for the 29-item version, and 127 studies have shown values from 0.70 to 0.92 for the 13-item SOC scale [12,13]. On the other hand, authors using factor analysis have reported 2, 3 and up to 5 factors [12][14][15][16][17], suggesting it may be a multidimensional scale. The original three-factor configural model of SOC has been rejected in some studies, and a different arrangement (SOC-R), also with three factors, has been proposed [18][19][20].
The Brazilian version of the SOC scale used in the present study is an adaptation of the Portuguese version from Portugal, prepared for a study carried out among Brazilian adolescent students and their mothers in 1997 [6,8], and it has been widely used since then. However, it has not been subjected to rigorous factor analysis [21]. Factor analysis is essential to assess internal construct validity because it examines the underlying dimensions of the scale that the items are purported to represent. An exploratory factor analysis of another Brazilian version used in cardiac patients suggested a bad fit for the three-factor model and low loadings in some items, leading the authors to conclude that one factor could be the best factorial solution [22]. Evaluating the factorial structure of the Brazilian version is also essential to assess the cross-cultural adaptation process and may provide further support for future studies in other countries, as the scale is expected to have a similar structure. Thus, the objective of this study was to assess the dimensional structure of the Brazilian version of the Sense of Coherence scale with 13 items according to Antonovsky's theory.

The Sense of Coherence scale
Antonovsky's SOC scale, also named The Orientation to Life Questionnaire, has been used in its full (29-item) and short (13-item) versions [1]. Both measure three dimensions (comprehensibility, manageability, and meaningfulness), which are theoretically distinguishable but interrelated and may overlap to the point of forming a single dimension. Antonovsky suggested that "it might seem possible to assign three separate sub-scores" [1] (page 87), as correlations among the three aforementioned dimensions in an Israeli sample were 0.45, 0.59 and 0.62, showing they are only moderately correlated. However, exploratory factor analysis carried out by Antonovsky found that the three components were not empirically separable, and he therefore proposed using an overall score (one dimension). The difficulty in separating items into clear clusters may be because Pearson's correlation, used at that time, underestimates the magnitude of correlations between ordinal items, making all correlations similarly low and obscuring distinct clusters. Currently, it is possible to run factor analysis using a correlation matrix suited to short ordinal items, that is, a polychoric correlation matrix.
Antonovsky described the three dimensions of the SOC. The first is "comprehensibility" (questions 2, 6, 8, 9 and 11). It is the cognitive component: "the stimuli deriving from one's internal and external environments in the course of living are structured, predictable, and explicable". The second dimension is "manageability" (questions 3, 5, 10 and 13). It is the instrumental component: "the resources are available to one to meet the demands posed by these stimuli". The third dimension is "meaningfulness" (questions 1, 4, 7 and 12), the motivational component: "these demands are challenges, worthy of investment and engagement" [1]. It is considered the most important SOC component. The specific questions of SOC-13 used in this study are summarised in Table 1.
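The attenuation of Pearson's correlation for coarsely categorised items can be illustrated with a small simulation. The sketch below is not part of the original analyses: the latent correlation of 0.6, the five response categories and the cut-points are hypothetical values chosen only for illustration.

```python
import numpy as np

def pearson_attenuation(rho=0.6, n=20000, cutpoints=(-1.0, -0.3, 0.3, 1.0), seed=0):
    """Compare Pearson's r before and after cutting two latent scores into categories."""
    rng = np.random.default_rng(seed)
    cov = [[1.0, rho], [rho, 1.0]]
    # Two continuous latent variables with a known correlation (hypothetical rho).
    latent = rng.multivariate_normal([0.0, 0.0], cov, size=n)
    # Coarsen each variable into ordered categories 1..5 using the same cut-points.
    ordinal = np.digitize(latent, cutpoints) + 1
    r_latent = np.corrcoef(latent[:, 0], latent[:, 1])[0, 1]
    r_ordinal = np.corrcoef(ordinal[:, 0], ordinal[:, 1])[0, 1]
    return r_latent, r_ordinal

r_latent, r_ordinal = pearson_attenuation()
print(f"Pearson r on latent scores:   {r_latent:.3f}")
print(f"Pearson r on ordinal answers: {r_ordinal:.3f}")
```

In this setup the correlation computed on the categorised answers is lower than the one computed on the underlying continuous scores, which is the kind of bias a polychoric correlation is designed to correct.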

Data sources and participants
This study is a secondary analysis of four independent cross-sectional Brazilian studies on oral health-related outcomes, using the 13-item SOC scale. The first study (Sample 1) was conducted on 1760 mothers and 1771 adolescents to evaluate the impact of primary health care services in south Brazil in 2011 [23]. The second (Sample 2) comprised 1100 adults and investigated the influence of psychosocial factors on health conditions in south Brazil during 2006-2007 [24]. The third study (Sample 3) investigated the relationship between SOC and oral health in a sample of 720 adults and older individuals in south Brazil during 2008-2009 [25]. The fourth (Sample 4) is a study on the association between adolescents' SOC and their oral health in 664 students in mid-west Brazil in 1997 [6]. Datasets 2 to 4 had items as ordered categories from 1 to 7, as proposed in the original version of the scale, while dataset 1 had all SOC items as ordered categories from 1 to 5.

Data analysis
Sample 1-confirmatory factor analysis
Confirmatory factor analysis was conducted to test the dimensional structure of the 13-item SOC scale in two models. The first (M1) was derived from Antonovsky's postulated theory that the scale was designed to provide a single measure, and the analysis consisted of a one-factor model. The second (M2) was based on the three SOC theoretical dimensions (comprehensibility, manageability, and meaningfulness), according to which specific items should load on each factor. The difference between M1 and M2 was tested using DIFFTEST (provided in the Mplus 7.11 software).
We applied the Weighted Least Square Mean and Variance Adjusted (WLSMV) estimator to analyse the SOC scale [26]. The overall goodness-of-fit of each model was evaluated using the comparative parameters provided by the software. Values < 0.05 for Root Mean Square Error of Approximation (RMSEA) and Standardised Root Mean Square Residual (SRMR) suggest close approximate (adequate) fit, whereas values > 0.10 indicate poor fit. The Comparative Fit Index (CFI) and the Tucker-Lewis index (TLI) represent incremental fit, and values > 0.95 are indicative of good fit. A Weighted Root Mean Square Residual (WRMR) value < 1.0 indicates a good fit. Internal consistency was assessed using the McDonald omega coefficient for each factor separately using standardised items.
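As an illustration of the internal consistency measure, the sketch below computes McDonald's omega for a single factor from standardised loadings, assuming uncorrelated residuals, so that omega = (sum of loadings)^2 / ((sum of loadings)^2 + sum of uniquenesses). The loadings are hypothetical and the function name is ours; this is not the code used in the study.

```python
def mcdonald_omega(loadings):
    """McDonald's omega for one factor, from standardised loadings.

    Assumes uncorrelated residuals, so each item's uniqueness is 1 - loading**2.
    """
    common = sum(loadings) ** 2
    unique = sum(1.0 - value ** 2 for value in loadings)
    return common / (common + unique)

# Hypothetical standardised loadings for a four-item factor.
example_loadings = [0.6, 0.6, 0.6, 0.6]
omega = mcdonald_omega(example_loadings)
print(f"omega = {omega:.2f}")
```

With residual correlations in the model, as in some of the modified models tested later, the denominator would also need the residual covariances, which this sketch leaves out.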

Samples 2 and 3-exploratory factor analysis
Results of the CFA of Sample 1 showed a poor fit, leading us to reject both models tested. Exploring Sample 1 to obtain a better fit would result in a model far from the one postulated theoretically. Therefore, exploratory factor analysis was carried out in two other datasets to reassess the configural validity, that is, how items allocate when freely estimated. Such a step is considered an initial part of the internal validation process in factor analysis [27]. Polychoric correlations were used for factor analysis in all datasets. The extraction method was iterated principal factors. The number of factors was defined using the Parallel Analysis (PA) method (paran command in Stata with 1000 replications in a PCA) in combination with the theoretical interpretability of the factors. Furthermore, items were retained in a specific factor if the absolute value of their loadings exceeded 0.3 [27]. The current analysis used the geomin oblique rotation. Communality measures the common factor variance of an item and equals 1-uniqueness (δ). A uniqueness > 0.70 indicates that an item may be unreliable, while a value ≤ 0.70 indicates that the factor accounts for a large percentage of the item variance. Our next step was to evaluate Bartlett's test of sphericity and the Kaiser-Meyer-Olkin measure of sampling adequacy. Statistically significant p-values (p < 0.05) in Bartlett's test and measures greater than 0.5 in the Kaiser-Meyer-Olkin test indicated that we could proceed with factor analysis [28]. According to Hair et al. [28], at this stage, items with an adequacy value < 0.50 should be considered for exclusion. Internal consistency was assessed using the McDonald omega coefficient for the whole scale using standardised items. All EFA analyses were performed in Stata 16.1 and MPlus version 7.11.

Table 1 Items of the SOC-13 scale (item, dimension, question, response range)
Soc01 (Meaningfulness): [...] Response range: "Very seldom or never" up until "very often"
Soc02 (Comprehensibility): "Has it happened in the past that you were surprised by the behaviour of people whom you thought you knew well?" Response range: "never happened" up until "always happened"
Soc03 (Manageability): "Has it happened that people whom you counted on disappointed you?" Response range: "never happened" up until "always happened"
Soc04 (Meaningfulness): "Until now your life has had" Response range: "No clear goals and purpose" up until "Very clear goals and purpose"
Soc05 (Manageability): "Do you have the feeling that you're being treated unfairly?" Response range: "Very often" up until "Very seldom or never"
Soc06 (Comprehensibility): "Do you have the feeling that you are in an unfamiliar situation and don't know what to do?" Response range: "Very often" up until "Very seldom or never"
Soc07 (Meaningfulness): "Doing things you do every day is" Response range: "A source of pleasure and satisfaction" up until "A source of pain and boredom"
Soc08 (Comprehensibility): "Do you have very mixed-up feelings and ideas?" Response range: "Very often" up until "Very seldom or never"
Soc09 (Comprehensibility): "Does it happen that you have feelings inside you would rather not feel?" Response range: "Very often" up until "Very seldom or never"
Soc10 (Manageability): "Many people-even those with a strong character-sometimes feel like sad sacks (losers) in certain situations. How often have you felt this way in the past?" Response range: "Never" up until "Very often"
Soc11 (Comprehensibility): "When something happened, have you generally found that" Response range: "You overestimated or underestimated its importance" up until "You saw things in the right proportion"
Soc12 (Meaningfulness): "How often do you have the feeling that there's little meaning in the things you do in your daily life?" Response range: "Very often" up until "Very seldom or never"
Soc13 (Manageability): "How often do you have feelings that you're not sure you can keep under control?" Response range: "Very often" up until "Very seldom or never"

Sample 4-confirmatory factor analysis
In this dataset, we conducted the Confirmatory Factor Analysis (CFA) using results from the previous EFA in Samples 2 and 3 to propose alternative models. All models departed from the 1-factor solution. In Model 1, five residual correlations were added until one absolute and one relative fit index reached acceptable values. In Model 2, items with very poor performance were removed (items 1 and 11), and residual correlations were added to obtain an acceptable fit as previously described. In Model 3, highly correlated items (items 2 and 3) were removed, and additional residual correlations were added. In Model 4, items with poor performance (1 and 11) were removed as well as item 2 (highly correlated with item 3). Finally, in Model 5, items with high correlation and poor performance were removed (1, 2, 3 and 11).
The robust weighted least squares mean and variance adjusted (WLSMV) estimator was used [26]. The measurement errors (uniqueness) and loadings were calculated. The goodness-of-fit of the model to the data was evaluated using the ordinary comparative parameters provided by the software. The Root Mean Square Error of Approximation (RMSEA) is an absolute fit index and incorporates a penalty for poor model parsimony [26,28]. Values lower than 0.05 suggest close approximate (adequate) fit, whereas values equal to or above 0.10 indicate poor fit, suggesting that the model should be rejected. The Comparative Fit Index (CFI) and the Tucker-Lewis index (TLI) represent incremental fit indices. Both range from zero to one, and values > 0.95 indicate a good fit.
An overall conclusion about the fit of each model can be obtained by considering these indices simultaneously.
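Reading the indices together can be sketched as a small helper that applies the cut-offs stated above to each index. The function name and the "intermediate" and "not good" labels are our assumptions, since the text only names the close, poor and good ranges. The two example calls use values reported in this paper for the one-factor model in Sample 1 and the nine-item model in Sample 4.

```python
def classify_fit(rmsea, cfi, tli, wrmr=None):
    """Label each fit index using the cut-offs stated in the text."""
    if rmsea < 0.05:
        rmsea_label = "close"
    elif rmsea >= 0.10:
        rmsea_label = "poor"
    else:
        # Assumed label: the text does not name the range between 0.05 and 0.10.
        rmsea_label = "intermediate"
    labels = {
        "RMSEA": rmsea_label,
        "CFI": "good" if cfi > 0.95 else "not good",
        "TLI": "good" if tli > 0.95 else "not good",
    }
    if wrmr is not None:
        labels["WRMR"] = "good" if wrmr < 1.0 else "not good"
    return labels

# One-factor model, Sample 1 (values reported in the paper).
one_factor_sample1 = classify_fit(rmsea=0.12, cfi=0.78, tli=0.73, wrmr=3.28)
# Nine-item one-factor model, Sample 4 (values reported in the paper).
nine_item_sample4 = classify_fit(rmsea=0.05, cfi=0.98, tli=0.97, wrmr=0.71)

print(one_factor_sample1)
print(nine_item_sample4)
```

Because the strict cut-off is RMSEA < 0.05, a reported value of exactly 0.05 falls in the intermediate range in this sketch, even though the other three indices meet their thresholds.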

Confirmatory factor analysis (sample 1)
The sample was composed of school adolescents between 12 and 19 years of age (mean = 14.1 years and standard deviation ± 2.2) from 36 cities in southern Brazil. Their household income was between BRL 500.00 and 1500.00, with an average per capita income equivalent to BRL 829.27, a value above the minimum wage in the Brazilian state where data were collected. For the sample of mothers, the McDonald omega was 0.77 for the whole scale, and for factors 1, 2 and 3 it was, respectively 0.60, 0.55, and 0.63. For the sample of teenagers, the McDonald omega was 0.66 for the whole scale, and for factors 1, 2 and 3, it was 0.50, 0.49, and 0.54, respectively.
The outcome of the CFA is presented in Table 2 with the one-factor model (M1) and three-factor model (M2) in adolescents and adults separately. The one-factor analysis indicated low item loadings apart from SOC06, SOC08, SOC09 and SOC10 (loading > 0.60) in adults. Several items with low loadings were found in the three-factor model too. The CFA for the total score model did not indicate an acceptable fit (Table 2).

Exploratory factor analysis (sample 2)
This is a sample from Sao Leopoldo city in southern Brazil and consisted of 1098 individuals. The study population consisted mainly of women (71.8%), white people (84%), aged between 20 and 49 years (mean = 44.3 years and standard deviation ± 15.8), and with 5-11 years of schooling (65.2%). The adjusted eigenvalues from PA were 4.11, 1.56 and 0.91. Although PA recommended retaining two components, we extracted three models with 1, 2 and 3 factors to assess how items behave in different situations and because theoretically the scale should have three factors. The first two factors explained 72.4% of the shared variance and the addition of a third factor increased it to 79.2%. For this sample, the McDonald omega was 0.78 for the whole scale.
Based on the theory that SOC could have three factors, we extracted loadings up to three factors in the exploratory analysis using Mplus. The fit of the one-factor model showed unacceptable values: RMSEA = 0.17, CFI = 0.70, TLI = 0.64. The fit of the two-factor model showed acceptable values: RMSEA = 0.07, CFI = 0.97, TLI = 0.95; however, the first factor was not interpretable. The fit of the three-factor model showed acceptable values: RMSEA = 0.05, CFI = 0.98, TLI = 0.97 but the first and the third factors were not interpretable (Table 3).
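The retention rule behind the adjusted eigenvalues from PA can be sketched as follows. This is an illustrative reimplementation, not the paran command itself. It runs on Pearson correlations of simulated continuous data. It also assumes that the adjustment subtracts the mean random eigenvalue minus 1 from each observed eigenvalue, and that components are retained while the adjusted value stays above 1. The simulated one-factor data (six items, loadings of 0.7) are hypothetical.

```python
import numpy as np

def sorted_eigenvalues(data):
    """Eigenvalues of the Pearson correlation matrix, largest first."""
    corr = np.corrcoef(data, rowvar=False)
    return np.sort(np.linalg.eigvalsh(corr))[::-1]

def count_retained(adjusted):
    """Count leading adjusted eigenvalues above 1, stopping at the first that is not."""
    n_retained = 0
    for value in adjusted:
        if value <= 1.0:
            break
        n_retained += 1
    return n_retained

def parallel_analysis(data, n_reps=200, seed=0):
    """Horn's parallel analysis with a mean-bias adjustment (assumed convention)."""
    rng = np.random.default_rng(seed)
    n_obs, n_items = data.shape
    observed = sorted_eigenvalues(data)
    # Eigenvalues of uncorrelated random data of the same shape estimate sampling bias.
    random_eigs = np.array([
        sorted_eigenvalues(rng.standard_normal((n_obs, n_items)))
        for _ in range(n_reps)
    ])
    bias = random_eigs.mean(axis=0) - 1.0
    adjusted = observed - bias
    return adjusted, count_retained(adjusted)

# Hypothetical one-factor data: 6 items, each loading 0.7 on a single factor.
rng = np.random.default_rng(1)
factor = rng.standard_normal((500, 1))
items = 0.7 * factor + np.sqrt(1 - 0.7 ** 2) * rng.standard_normal((500, 6))
adjusted, n_factors = parallel_analysis(items)
print("adjusted eigenvalues:", np.round(adjusted, 2))
print("components retained:", n_factors)

# Applying the same rule to the adjusted eigenvalues reported for Sample 2.
print(count_retained([4.11, 1.56, 0.91]))  # 2
```

Applied to the adjusted eigenvalues reported for Samples 2 and 3, the rule retains two components, which matches the recommendation described in the text.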

Exploratory factor analysis (sample 3)
This is a sample from Porto Alegre, a capital city in southern Brazil, and consisted of 720 individuals. The age range of the participants was 50-74 years (mean = 60.2 years and standard deviation ± 7.5), and they were predominantly female (57.8%); 26.2% earned two minimum wages or less monthly, and 29.8% had less than six years of study. The polychoric correlation matrix shows inter-item correlations used for exploratory factor analysis. The adjusted eigenvalues from PA were 4.32, 1.42 and 0.86. Although PA recommended retaining two components, we extracted three models with 1, 2 and 3 factors to assess how items behave in different situations and because theoretically the scale should have three factors. The first two factors explained 74.0% of the shared variance and the addition of a third factor increased it to 79.6%. For this sample, the McDonald omega was 0.81 for the whole scale.
As in the previous sample, factor loadings were extracted for up to 3 factors in the exploratory analysis using Mplus. The fit of the one-factor model showed unacceptable values: RMSEA = 0.12, CFI = 0.84, TLI = 0.80. The fit of the two-factor model showed acceptable values: RMSEA = 0.04, CFI = 0.98, TLI = 0.98; however, the first factor was not interpretable. The fit of the three-factor model showed acceptable values: RMSEA = 0.04, CFI = 0.99, TLI = 0.98, but the first and the third factors were also not interpretable. In the one-factor model, SOC01, SOC04 and SOC11 showed low loadings (< 0.40). The models with two and three factors showed a good fit, but the results differed from the theory, and the factors were uninterpretable (Table 3).

Confirmatory factor analysis (sample 4)
This is a sample from Goiania city in Middle-West Brazil. It consisted of 664 adolescents, all 15 years old (344 female and 320 male), whose mothers were predominantly from low socioeconomic status (60.7%) and had low levels of schooling (36.8%). For the sample of adolescents, the McDonald omega was 0.78 for the whole scale (one factor). Based on previous results from studies 2 and 3, we decided to exclude some items and test five different one-factor models using CFA. The results indicated high item loadings (Table 4) except for items SOC01 and SOC11 (loading < 0.40). All alternative models yielded good fit indices, but the best fit was obtained for a model without 4 items (SOC01, SOC02, SOC03 and SOC11) with RMSEA = 0.05, CFI = 0.98, TLI = 0.97 and WRMR = 0.71 (Table 4).

Discussion
The present study investigated the configural structure of the Brazilian version of the SOC-13 used in oral health studies and we could not confirm the initial hypothesis that either one or three theoretical dimensions would have an adequate fit. To the best of our knowledge, this is the most comprehensive study on the configural validity of SOC-13, at least in Brazil. A modified model, removing four items (SOC01, SOC02, SOC03 and SOC11), showed that the one-factor model had a more parsimonious structure with acceptable fit indices. Other alternative models yielded an adequate fit when several additional residual correlations were implemented.
Table 3 Exploratory factor analysis of the SOC-13 scale in adults and elderly of two south Brazilian municipalities.
One of the assumptions in this study was that the SOC-13 had cross-cultural equivalence for the Brazilian population [21], but some items had very low loadings in the one-factor model and were removed to achieve an acceptable fit. In line with our study, an acceptable fit was achieved in the Norwegian version only after dropping items SOC02 to SOC04 and SOC11 [13]. A systematic review on cross-cultural adaptation to Brazil reported that individuals had difficulties in understanding items SOC01, SOC06 and SOC11 [21]. Despite a consensus in the literature regarding the face and content validity of the scale, we observed that researchers from other countries also pointed out similar psychometric problems concerning those items and a factorial structure different from the one originally proposed, suggesting that some items may be culturally sensitive. International studies reported that some items loaded on different factors than those theorised [16,29], therefore rejecting the original three-factor model. Interviewees encountered difficulties in answering and interpreting SOC02 and SOC03 and considered those items as addressing the same question [15]. This may explain why SOC02 and SOC03 were reported to have a higher residual correlation than any other pair of items in the scale [16,30,31], forming a factor on their own. Our final CFA supports a modified one-factor model after removing four items. In contrast, a previous study, performing CFA on SOC-29 [32], showed that the three-dimension model presented a better fit than the one-dimension model [33]. Konttinen argued that SOC-13 should have three distinct dimensions [34], but this and other studies have also failed to confirm those dimensions [13][15][16][17][34].
Notably, a short instrument (13 items) may not be able to capture three distinct factors, and this fact can be relevant if dimensions are theoretically correlated and items have cross-loadings. Not surprisingly, the three conceptual dimensions have never been reported in previous confirmatory factor analyses of the SOC-13 scale [12,35].
The first version of the Sense of Coherence scale consisted of 29 items, and an abridged version with 13 items was later developed by Antonovsky [1,35]. However, methodological details on the development of this version have not been identified in the literature. If a gold standard is not available, the best approach to shorten a scale is to combine an expert-based approach with statistical techniques [36]. In addition, for multidimensional scales, each sub-scale should be subject to a specific shortening process to ensure that it will keep the same dimensional structure. An abridged version that mixed up dimensions may be less valuable than the original scale, but it became popular because it is helpful in large surveys to tap an overall dimension.
This study has some strengths and limitations. Although we have four large samples from different parts of Brazil, results can only be generalised with caution. Nonetheless, our findings were similar to those from other countries with different cultural backgrounds [16][29][30][31]. Importantly, our four databases were large, with sample sizes between 664 and 1767 individuals, reducing the chances of random error regarding item loadings and other psychometric issues [37]. We were also able to test the configural model of the SOC-13 based on a priori specification of two models for CFA and, after rejecting them, we could explore modifications using EFA in two datasets of different populations. However, removing items is a rather simplistic approach to solving the problem of a statistical misfit, as there is growing evidence to suggest that the use of fit indices to define the number of factors and other issues is problematic in many ways [38]. We believe definite changes should be guided by theory and discussed among scholars.
In conclusion, our results indicate that the Brazilian version of SOC-13 scale needs further adjustments. Several alternative models yielded a good fit, after modification in key issues. The one-factor model with nine items showed a good statistical fit and was the most parsimonious, but such changes need further discussion before being implemented in future studies and the theoretical background has to be considered as proposed by Antonovsky [1]. For example, removing items may affect the content validity [39]; thus, qualitative studies with respondents and experts are essential to warrant the theoretical validity of the cross-cultural adaptation process and possibly the shortening process from SOC-29 may be revised.