Cross-cultural adaptation of discrimination and vigilance scales in ELSA-Brasil

ABSTRACT OBJECTIVE To describe the process of cross-cultural adaptation for use in Brazil of the everyday discrimination scale (EDS) and the heightened vigilance scale (HVS), applied in the Longitudinal Study of Adult Health (ELSA-Brasil). METHODS Conceptual, item and semantic equivalence analyses were conducted by a group of four epidemiologists, followed by evaluation of measurement equivalence (factor analysis of configural, metric and scalar invariance, according to sociodemographic characteristics) and reliability. A total of 11,987 participants responded to the discrimination scale, and a subsample of 260 people participated in the test-retest study. In the case of the HVS, 9,916 people responded, and 149 individuals did so in the test-retest study. RESULTS The scales presented conceptual, item and semantic equivalence pertinent to the Brazilian context, in addition to adequate correspondence of the referential/denotative meaning of terms and of the general/connotative meaning of the items. The confirmatory factor analysis of the EDS revealed a unidimensional structure, with residual correlations between two pairs of items, presenting configural and metric invariance among the four subgroups evaluated. Scalar invariance was identified according to sex and age group, but not for race/skin color and education. The HVS showed low factor loadings and high residuals, with inadequate fit indicators. For the items of the discrimination scale, the weighted kappa coefficient (Kp) ranged from 0.44 to 0.78, and the intraclass correlation coefficient (ICC) was 0.87. For HVS items, the Kp ranged from 0.47 to 0.59 and the ICC was 0.83. CONCLUSIONS Although there are correlated items, the EDS is a promising scale to evaluate experiences of perceived discrimination in Brazilian daily life. However, the HVS did not present measurement equivalence in its current format.


INTRODUCTION
In recent decades, the number of studies evaluating the association between discrimination and health has increased considerably 1,2. Their results are part of the documentation of health disparities in the United States (US), given the relationships established between experiences of discrimination and worse mental and physical health indicators 3,4. Part of this evidence emerges from studies of interpersonal discrimination, in which experiences of hostile events in everyday life have received greater emphasis 5, because they are a measure of chronic exposure to psychosocial stressors 3,6.
The perception of everyday discrimination involves unfair and recurrent practices in interpersonal interactions in different contexts and environments, including manifestations of disrespectful treatment, belittlement and offer of worse care or service 6 . The everyday discrimination scale (EDS) 7 is one of the most widely used instruments to assess racial/ethnic discrimination, especially in the US, but also in European countries, Canada, and South Africa 8 . Originally proposed in the context of the Detroit Area Study, to assess experiences and frequency of self-reported discrimination of racial/ethnic groups and its impact on health 7 , the scale attempts to capture subtler chronic or episodic aspects of interpersonal discrimination 2,4 .
The diffusion of the EDS was favored by its brevity and by the psychometric qualities described in its first decade of use 1,2,7,9, in studies focused primarily on its performance in different racial/ethnic groups, such as African Americans 9 and Latinos in the US 10, and among women [11][12][13][14]. These studies suggest good psychometric performance, but recommend that the scale be evaluated among heterogeneous social groups that include racial, gender identity and social class diversity. In addition, they question whether the scale should be used to assess the general perception of discrimination, beyond racial discrimination, for which it was considered more appropriate [12][13][14].
Discrimination-related vigilance has been emphasized as an important component of the association between experiences of discrimination and health events 15. It is a coping mechanism characterized by the physical and mental preparation of the individual, with continuous monitoring of the environment and of what happens around them, as well as constant re-adaptation, in order to protect oneself from or avoid an experience of discrimination 15,16. To assess this component of discrimination, the heightened vigilance scale (HVS) was also proposed for the Detroit Area Study, to be used in sequence to the EDS and applied to those who reported previous experiences of discrimination 15.
The coping captured by the HVS has been linked to health problems in the US 15,17, such as stress 18. It has also been considered a potential mediator in the theoretical model that relates race/ethnicity to adverse health outcomes 16. However, knowledge about its psychometric properties is limited, since the measurement and the structure of its construct have been little explored 19.
In order to study the effects of racial discrimination on health, both scales were included in the questionnaire of the third stage (wave 3) of follow-up in the Longitudinal Study of Adult Health (ELSA-Brasil). This article aims to describe the process of cross-cultural adaptation for the use of the EDS and the HVS in Brazil.

METHODS
Conceptual, Item and Semantic Equivalences
After authorization from the scales' authors, we proceeded to the stages of cross-cultural adaptation based on the recommendations of the literature 20. https://doi.org/10.11606/s1518-8787.2022056004278 Conceptual and item equivalences were evaluated by a group of four epidemiologist researchers with previous experience in the use of scales and/or with the theme of racial discrimination. The process involved an extensive literature review of the use of the instruments, their previous psychometric performance and the relevance of the scales to the Brazilian context. Semantic equivalence involved four steps: 1) translation of the original instrument from English to Brazilian Portuguese, independently, by two experienced researchers fluent in English. The translators used a standardized form assigning a grade (between 0 and 10) to the degree of difficulty encountered in the translation. The translations generated a consensus version in Portuguese, made by the team of researchers, with the presence of the translators; 2) back-translation of the consensus version in Portuguese, performed by a translator who is a native English speaker, who recorded comments and rated the degree of difficulty of the back-translation, also on a scale from 0 to 10; 3) comparison of the original version of the scale with the back-translated version, evaluating the semantic equivalence of the two versions in order to ensure the transfer of the meanings of words between the two languages. After adjustments and adaptations, corrected versions of the instrument were developed for pre-tests; 4) pre-tests and pilot study of the proposed versions (50 and 18 volunteers, respectively, with characteristics similar to those of the study population).

Measurement Equivalence
After applying the scales in wave 3 of ELSA-Brasil, we evaluated the dimensional structure, via exploratory factor analysis (EFA) and confirmatory factor analysis (CFA), and the adequacy of the internal structure, examining the invariance of the configural, metric and scalar structures between subgroups of self-reported race/skin color, sex, age group and education.
ELSA-Brasil is a prospective cohort that, at baseline (2008-2010), enrolled 15,105 participants of both sexes aged between 35 and 74 years, active and retired employees of six Brazilian educational and research institutions, to monitor chronic health outcomes 21. Wave 3 of ELSA-Brasil took place between 2017 and 2019, with face-to-face interviews of 12,636 participants. A total of 11,987 individuals responded to the discrimination scale and were included in its analyses; of these, 9,916 reported experiences of discrimination and responded to the HVS.
ELSA-Brasil was approved by the Research Ethics Committee of each of the institutions involved, by the National Research Ethics Commission (Conep 976/2006) and all participants signed an informed consent form.

Dimensional Structure and Invariance between Subgroups
The existence of previous models of the dimensionality of the discrimination scale 2,22 guided the psychometric evaluation of this instrument, which started directly from the CFA. Given the limited knowledge about the psychometric properties of the vigilance scale, its evaluation started with EFA, followed by CFA. The KMO index and Bartlett's sphericity test were used to verify the adequacy of the data, and parallel analysis was used as the criterion for the retention of factors. The parameters were estimated by Weighted Least Squares Mean and Variance Adjusted (WLSMV), using a polychoric matrix 23. The minimum criterion adopted for the standardized loading of the items was 0.50, and loadings ≥ 0.70 were considered ideal 24. In addition, the residual correlations between the items, the modification indices and the expected parameter change values were examined.
To evaluate the adequacy of the model, three indices were considered: the Comparative Fit Index (CFI), the Tucker-Lewis Index (TLI) and the root mean square error of approximation (RMSEA). RMSEA values of less than 0.06 are preferable, but up to 0.08 are acceptable. Its 95% confidence interval (95%CI) was also considered as an additional assessment, with the upper limit not exceeding 0.08. Models with good fit have CFI and TLI of approximately 1: indices ≥ 0.90 are acceptable and ≥ 0.95 are preferable 24 . Convergent validity was evaluated by the average variance extracted (AVE) and internal consistency by the composite reliability (CR), being considered acceptable values when AVE ≥ 0.50 and CR ≥ 0.60 24 .
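The cut-offs above can be illustrated with a minimal Python sketch. It assumes a one-factor model with standardized loadings and uncorrelated residuals, so that each residual variance equals one minus the squared loading; the function names and the example loadings are hypothetical and are not taken from the study. The two sets of fit indices passed to `classify_fit` are the ones this article reports for the HVS models.

```python
def ave_and_cr(loadings):
    """Return (AVE, CR) from the standardized loadings of a one-factor model.

    Assumption: residuals are uncorrelated, so each residual variance
    is 1 - loading**2.
    """
    if not loadings:
        raise ValueError("at least one loading is required")
    squared = [value * value for value in loadings]
    ave = sum(squared) / len(loadings)
    sum_loadings_sq = sum(loadings) ** 2
    residual_variance = sum(1.0 - s for s in squared)
    cr = sum_loadings_sq / (sum_loadings_sq + residual_variance)
    return ave, cr


def classify_fit(cfi, tli, rmsea):
    """Label a model with the CFI, TLI and RMSEA cut-offs stated in the text."""
    if cfi >= 0.95 and tli >= 0.95 and rmsea < 0.06:
        return "preferable"
    if cfi >= 0.90 and tli >= 0.90 and rmsea <= 0.08:
        return "acceptable"
    return "inadequate"


# Hypothetical loadings for a six-item scale (not the study's estimates).
example_loadings = [0.75, 0.72, 0.68, 0.60, 0.45, 0.36]
ave, cr = ave_and_cr(example_loadings)
print(f"AVE = {ave:.3f} (meets >= 0.50: {ave >= 0.50})")
print(f"CR  = {cr:.3f} (meets >= 0.60: {cr >= 0.60})")

# Fit indices reported in this article for the one-factor HVS models.
print(classify_fit(cfi=0.918, tli=0.864, rmsea=0.138))
print(classify_fit(cfi=0.991, tli=0.980, rmsea=0.053))
```

The sketch ignores the 95%CI of the RMSEA, which the text also takes into account, and the CR formula would need an extra term if residual correlations were included in the model.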
In the case of everyday discrimination, multiple-group CFA was conducted for self-reported race/skin color, sex, age group and education to assess whether each subgroup has the same structure identified in the CFA without stratification. For the invariance comparison with those who self-declared as white, the indigenous (n = 98) and yellow/Asian (n = 308) groups were excluded due to their low proportion among the participants, while the black and brown categories were combined, since they presented similar fit parameters in the configural models by racial subgroup.
The results of the configural models by subgroup, with acceptable fit, allowed us to proceed with the tests for measurement equivalence. This step consists of comparing the subgroups through three sequential and nested models: 1) the reference model: the least restricted model, which allows evaluation of configural invariance, that is, the equivalence of the factor structure, checking whether the number of factors and the distribution of items among them are maintained across the subgroups of race/skin color, sex, age group and education; 2) metric invariance: evaluates the equivalence of the pattern of factor loadings between subgroups, thereby adding a constraint to the first model: equal factor loadings between subgroups; and 3) scalar invariance: evaluates the equivalence of intercepts, that is, whether individuals with the same score on the latent construct would obtain a similar score on the observed variable, regardless of the subgroup to which they belong. This model constrains not only the factor loadings but also the intercepts to be equal between groups 24,25.
At each stage, we compared the adjustment indices with the indices of the previous model, mainly evaluating the magnitude and direction of the variations in the incremental indices: if the more restrictive model showed a reduction in the CFI ≥ 0.010 complemented by an increase in the RMSEA ≥ 0.015 26 , then the invariance hypothesis was rejected. The chi-square test (χ²) for model comparison was used, but interpreted with caution due to its sensitivity to sample size 25 . All analyses of these steps were conducted in the software Mplus 8.12.
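The decision rule described above can be written as a short Python sketch. The function name and default arguments are illustrative assumptions; the rule encoded is the joint criterion stated in the text (CFI reduction ≥ 0.010 together with RMSEA increase ≥ 0.015), and the deltas are the metric-to-scalar changes that this article reports in its Results.

```python
def invariance_rejected(delta_cfi, delta_rmsea,
                        cfi_drop=0.010, rmsea_rise=0.015):
    """Reject invariance when the CFI falls by at least `cfi_drop`
    AND the RMSEA rises by at least `rmsea_rise` (joint criterion)."""
    return delta_cfi <= -cfi_drop and delta_rmsea >= rmsea_rise


# (delta CFI, delta RMSEA) from the metric to the scalar model,
# as reported in the Results of this article.
reported_deltas = {
    "age group": (-0.011, 0.007),
    "race/skin color": (-0.024, 0.023),
    "education": (-0.018, 0.016),
}

for subgroup, (d_cfi, d_rmsea) in reported_deltas.items():
    verdict = "rejected" if invariance_rejected(d_cfi, d_rmsea) else "retained"
    print(f"{subgroup}: scalar invariance {verdict}")
```

Under this joint rule, the borderline CFI reduction for age group does not by itself lead to rejection, because the RMSEA increase stays below 0.015.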

Test-retest Reliability Study
To evaluate the temporal stability of the items and scores, the EDS and HVS were applied twice, with an interval of seven to 14 days (mean of ten days), in a random subsample of 260 participants, of whom 149 reported experiences of discrimination and also responded to the HVS.
Intra-observer reproducibility was assessed by the kappa coefficient with quadratic weighting (Kp), interpreted according to Fleiss's recommendations 27 . The intraclass correlation coefficient was used to evaluate the reproducibility of the scale scores in the test and retest, and the results were considered satisfactory when they reached minimum values of 0.70 27 . These analyses were conducted in R software, version 4.0.3, with a significance level of 5%.
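For illustration only, the two reliability statistics can be sketched in plain Python. The study's analyses were run in R; the functions below are a hypothetical re-implementation. The text does not state which ICC form was used, so the one-way random-effects, single-measurement form is an assumption here, and the response vectors are invented.

```python
def quadratic_weighted_kappa(test, retest, n_categories):
    """Cohen's kappa with quadratic weights for ordinal codes 1..n_categories."""
    if len(test) != len(retest) or not test:
        raise ValueError("test and retest must be non-empty and equally long")
    n, k = len(test), n_categories
    observed = [[0] * k for _ in range(k)]
    for a, b in zip(test, retest):
        observed[a - 1][b - 1] += 1
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(observed[i][j] for i in range(k)) for j in range(k)]
    weighted_observed = 0.0
    weighted_expected = 0.0
    for i in range(k):
        for j in range(k):
            weight = (i - j) ** 2 / (k - 1) ** 2  # quadratic disagreement weight
            weighted_observed += weight * observed[i][j]
            weighted_expected += weight * row_totals[i] * col_totals[j] / n
    if weighted_expected == 0:
        raise ValueError("kappa is undefined when all responses are identical")
    return 1.0 - weighted_observed / weighted_expected


def icc_one_way(test, retest):
    """One-way random-effects ICC for single measurements (assumed form)."""
    if len(test) != len(retest) or len(test) < 2:
        raise ValueError("need at least two paired scores")
    n, k = len(test), 2
    pairs = list(zip(test, retest))
    subject_means = [(a + b) / k for a, b in pairs]
    grand_mean = sum(subject_means) / n
    ms_between = k * sum((m - grand_mean) ** 2 for m in subject_means) / (n - 1)
    ms_within = sum((a - m) ** 2 + (b - m) ** 2
                    for (a, b), m in zip(pairs, subject_means)) / (n * (k - 1))
    return (ms_between - ms_within) / (ms_between + (k - 1) * ms_within)


# Invented item responses (1 to 6) and total scores for five respondents.
item_test, item_retest = [1, 2, 2, 4, 6], [1, 2, 3, 4, 5]
score_test, score_retest = [10, 18, 25, 31, 48], [11, 17, 27, 30, 45]
print(f"Kp  = {quadratic_weighted_kappa(item_test, item_retest, 6):.2f}")
print(f"ICC = {icc_one_way(score_test, score_retest):.2f}")
```

Quadratic weights penalize a disagreement in proportion to the squared distance between the two response categories, which suits ordinal frequency options.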

RESULTS
The literature review on the topic of discrimination and the debate among researchers specialized in the area showed that there was conceptual and item equivalence, pointing to the relevance of the original instrument in our culture.
Most of the EDS items were considered of little difficulty for translation. Only three items were considered of moderate difficulty and generated some inconsistency among translators or debate among researchers, because they contained terms and expressions that are unusual in our context. All change decisions, described below, were corroborated in the pre-test rounds: (1) You receive poorer service than other people at restaurants or stores: in this case, the expression poorer service was adapted to the Portuguese equivalent of "worst quality care"; (2) You are threatened or harassed: since we did not find a colloquial term for harassment, we opted for "threatened or harassed/embarrassed"; (3) You are followed around in stores: using the expression "be followed" might not represent well the semantic content of the item, since in Brazil some salespeople are instructed to follow customers as a sign of greater attention paid to them. As such, we adapted it to "treated suspiciously and is watched in places like stores." In the evaluation of the translators and in subsequent pre-tests, there were no difficulties in the translation or interpretation of the HVS items.
The original versions of the scales, as well as their final versions, are presented in the Chart. The discrimination and vigilance scales consist, respectively, of 10 and six situations/items with answers in Likert format. For the EDS, each answer was scored from 1 (never) to 6 (almost every day), with a total score ranging from 10 to 60 points; for the HVS, each answer ranged from 1 (never) to 5 (very often), with a total score between six and 30 points. The higher the score, the more frequent the experience of discrimination and vigilance.
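The scoring just described is a simple sum of item responses. A minimal Python sketch, with hypothetical function names and an invented response vector, shows why a 10-item scale scored 1 to 6 yields totals between 10 and 60, and a six-item scale scored 1 to 5 yields totals between 6 and 30.

```python
def score_range(n_items, max_option):
    """Minimum and maximum total score when every item is coded 1..max_option."""
    return n_items, n_items * max_option


def total_score(responses, n_items, max_option):
    """Sum the item responses after checking count and coding."""
    if len(responses) != n_items:
        raise ValueError(f"expected {n_items} responses, got {len(responses)}")
    for value in responses:
        if not isinstance(value, int) or not 1 <= value <= max_option:
            raise ValueError(f"response {value!r} is outside 1..{max_option}")
    return sum(responses)


print("EDS range:", score_range(n_items=10, max_option=6))
print("HVS range:", score_range(n_items=6, max_option=5))

# Invented EDS responses for one participant (not study data).
eds_responses = [1, 2, 1, 3, 1, 1, 1, 2, 1, 1]
print("EDS total:", total_score(eds_responses, n_items=10, max_option=6))
```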
Chart. Items of the everyday discrimination scale and the heightened vigilance scale, original version and final version in Brazilian Portuguese.

Original English version | Final version in Brazilian Portuguese
Everyday discrimination scale (EDS) | Escala de discriminação no dia a dia (EDD)
In your day-to-day life, how often do any of the following things happen to you? |
1. You are treated with less courtesy than other people are. | O(a) Sr(a) é tratado(a) com menos gentileza do que as outras pessoas.
2. You are treated with less respect than other people are. | O(a) Sr(a) é tratado(a) com menos respeito do que as outras pessoas.
3. You receive poorer service than other people at restaurants or stores. |

Dimensional Structure and Invariance Between Subgroups
The study participants were mostly women (56%), in the age range of 40 to 59 years (53.8%), with complete higher education (58%) and self-declared as white (52%). Similar characteristics were observed among participants in the subsample of the test-retest reliability study.
In the CFA for the discrimination scale (Table 1)
For the evaluation of the HVS, we first proceeded to the EFA, using principal axis extraction with oblique rotation. The KMO index for sample adequacy was 0.75, considered good, indicating that the matrix was factorable. Bartlett's sphericity test with significance levels p < 0.00 was considered adequate, indicating that the correlation matrix was not an identity matrix. Parallel analyses indicated the retention of two factors. Exploratory factor analysis with one factor showed very low loadings for items 5 and 6: 0.495 and 0.363, respectively. This one-factor model also did not present good fit indicators: CFI = 0.918; TLI = 0.864; RMSEA = 0.138. The results of the EFA with two factors improved these indicators, but showed cross-loadings in items 3 and 6, as well as residuals with high values. The CFA with a single-factor solution presented acceptable results after insertion of residual correlations between items 1 and 2 and between items 4 and 5 of the scale: CFI = 0.991; TLI = 0.980; RMSEA = 0.053. However, items 5 and 6 continued to have loadings below 0.500. The composite reliability for this model (0.668) was satisfactory, but the convergent validity (AVE = 0.258) was lower than recommended (Table 2). Given the low loadings and high residuals in the confirmatory factor analysis of the HVS, we chose not to proceed with the configural models by subgroup or with the tests for measurement equivalence of this scale.
The evaluation of measurement equivalence for the EDS showed CFI (≥ 0.95) and RMSEA (≤ 0.080) values indicating acceptable fit for configural invariance in the four subgroup comparisons. In the metric invariance models, these fit indicators improved in all comparisons: the RMSEA was significantly reduced, with a 95%CI that did not overlap the estimate of the previous model, and the CFI increased to more than 0.990 in all cases. The scalar invariance model clearly held only for the comparison between sex subgroups. For the comparison between age groups, the reduction in CFI was borderline, going from 0.992 in the metric invariance model to 0.981 in the scalar model (Δ of -0.011), while the increase in RMSEA was tolerable (Δ of 0.007). Scalar invariance was not reached for comparisons in the race/skin color and education subgroups. In both cases, the increase in RMSEA was greater than 0.015 (0.023 and 0.016, respectively) and the reduction in CFI was greater than 0.010 (-0.024 and -0.018, respectively) (Table 3).

Test-retest Reliability Study
For the EDS, the test and retest scores ranged respectively from 10 to 48 (mean = 18.07) and from 10

DISCUSSION
Following the steps recommended by the literature 20 regarding cross-cultural adaptation, the results showed that the Brazilian version of the EDS presents acceptable cross-cultural adaptation, which allows its future use in epidemiological studies. However, our analyses do not corroborate the use of the discrimination-related HVS in the current format.
Regarding the EDS, our analysis supported the unidimensionality of the scale, similar to other studies 2,22 , and was consistent with previous research on local dependence or high correlation between items 1 and 2 and between items 8 and 9 13,22,28 . A psychometric study that also qualitatively explored the scale indicated that the correlation between items 1 and 2 may be due to redundancy, since these items were seen as having similar meaning by the respondents 28 , although this was not captured in the pre-tests carried out within the framework of ELSA-Brasil. On the other hand, the correlation between items 8 and 9 can be explained by the nature of these experiences, related to acute forms of discrimination, directly addressed and more evident, and distancing themselves from the subtle or chronic nature of the experiences for which the EDS was proposed 22 .
Taking into account these assessments, a summary version of the five-item discrimination scale (α = 0.77) was developed for the Chicago Community Adult Health Study 29 and it contemplates changes in items that we identified as correlated: the first two items of the scale were joined (treated with less courtesy/respect than other people) and item 8 was removed, keeping only item 9, plus the term "embarrassed". Items 6, 7 and 10 were not included in the short version. We therefore encourage future research in Brazil and other Portuguese-speaking countries to evaluate the appropriateness of modified summary versions of the scale.
The models used to evaluate the equivalence of measurement between groups in the discrimination scale indicated metric invariance in all cases, providing evidence that respondents use the scale in a similar way between subgroups, so the differences between values can be compared 24 . However, scalar invariance was not achieved for comparisons between racial groups and groups of different schooling levels. These results confirm the conclusions of more recent studies on measurement invariance in EDS, which reported, when considering general discrimination without attributing motivation, lack of equivalence (or non-invariance) between racial groups and based on schooling 12,13 . These studies, however, differed regarding the invariance between other subgroups: while one indicated that only comparisons between age groups of the estimates of the discrimination scale can be considered significant 13 , the other suggested that comparisons between men and women are appropriate 12 . It should be noted, however, that the change in CFI values adopted in these studies, with reductions ≥ 0.002, were considered an indication of non-invariance between groups. This cut-off point was more conservative than the one adopted in our analyses.
Regarding the HVS, unacceptable indicators were observed and, as far as we could determine, few studies have evaluated the psychometric properties of this scale, which limits comparisons. Similar challenges related to scale dimensionality have been reported in other studies. For example, a study in the US indicated two dimensions in exploratory and confirmatory factor analyses: one related to what the authors termed preparation (items 1 to 3) and another related to what they called caution (items 4 to 6) 19. The same study, using a single-factor model, found unsatisfactory fit (CFI = 0.94 and RMSEA = 0.12) and chose to continue the analysis with a two-factor model, with improvement in the indicators (CFI = 0.98 and RMSEA = 0.07). Additional information on the scale structure comes from the study that originally published the HVS and reported a principal component analysis, with one component (eigenvalue = 3.42) accounting for 57% of the standardized variance (loadings > 0.69) 15. These aspects have also motivated the revision of this scale, and a short version with four items (α = 0.72) was applied in the Chicago Community Adult Health Study 30. This version removes two items that were involved in residual correlations in our analysis: items 1 and 5. Another study used an even shorter version of the HVS, with removal of item 4 (α = 0.66) 16, and LaVeist et al. 17 used the original version, with the removal of item 6 (α = 0.69).
Although the abbreviated HVS is used in research on its association with health outcomes, additional psychometric assessments are needed in different contexts to elucidate whether the scale can be used in the population and in the comparison between subgroups. We found only one study that evaluated the measurement invariance of HVS, in which comparisons between transgender and cisgender and between cisgender subgroups could be performed, since the scale works equivalently between these groups. On the other hand, comparisons between transgender subgroups required caution, since partial metric invariance and partial scalar invariance were found 19 .
The strengths of the present study are the quality of the data collection process and the broad coverage of the assessments that make up the stages of cross-cultural adaptation, in a large sample with diverse sociodemographic characteristics. However, the composition of the ELSA-Brasil population (younger and older adults, employed or retired, and with more education than the average of the Brazilian population) limits its representativeness.
Finally, we highlight that the EDS obtained acceptable psychometric results for use in ELSA-Brasil and similar populations. It was not possible to identify acceptable psychometric properties for the vigilance scale. However, given the importance of the theme in epidemiological studies in the Brazilian reality, it is advisable that the most recently proposed summary versions be used in new studies and evaluated on their relevance in our context.