Prevalence evolution of SARS-CoV-2 infection in the city of São Paulo, 2020–2021

ABSTRACT
OBJECTIVE To estimate the evolution of the prevalence of SARS-CoV-2 virus infection among residents aged 18 years or over in the municipality of São Paulo.
METHODS This is a population-based household survey conducted every 15 days, between June and September 2020, and January and February 2021. In total, the study comprised 11 phases. The presence of antibodies against SARS-CoV-2 was identified in venous blood using a lateral flow test, Wondfo Biotech. In the last phase, the researchers combined it with an immunoenzymatic test, Euroimmun. The participants also answered a semi-structured questionnaire on sociodemographic and economic factors, and on social distancing measures. Prevalence estimates and their 95% confidence intervals were calculated according to region, Human Development Index, sex, age group, ethnicity, education, income, and variables associated with risk or prevention of infection. To compare the frequencies among the categories of each variable, the chi-square test with Rao-Scott correction was used, considering a significance level of 5%.
RESULTS In total, 23,397 individuals were interviewed and had their samples collected. The estimated prevalence of antibodies against SARS-CoV-2 ranged from 9.7% (95%CI: 7.9–11.8) to 25.0% (95%CI: 21.7–28.7). The prevalence of individuals with antibodies against the virus was higher among black and brown people, people with lower schooling and income, and among residents of regions with lower Human Development Index. The lowest prevalences were associated with recommended measures of disease protection. The proportion of asymptomatic infection was 45.1%.
CONCLUSION The variation in the estimated prevalence of infection was lower than the variation in cumulative incidence, except for the last phase of the study. The differences in prevalence estimates observed among subpopulations showed social inequality as a risk of infection. The lower prevalence observed among those who could follow prevention measures reinforces the need to maintain social distancing measures as a way to prevent SARS-CoV-2 infection.


INTRODUCTION
In December 2019, the World Health Organization (WHO) was notified of a pneumonia outbreak in Wuhan, Hubei Province, People's Republic of China. The etiological agent was quickly identified: a new coronavirus called SARS-CoV-2. On January 30, 2020, WHO declared the outbreak of the disease caused by the virus (COVID-19) a Public Health Emergency of International Concern (PHEIC), the highest WHO alert level according to the International Health Regulations (IHR). On March 11, 2020, WHO declared the outbreak of COVID-19 a global pandemic¹.
In Brazil, the Ministry of Health (MoH) declared the outbreak a Public Health Emergency of National Concern (PHENC) on February 3, 2020 2 . The first case was diagnosed on February 26, 2020, and on March 20, 2020, MoH announced the community transmission of COVID-19 in the national territory 3 .
From the date of the first case until March 28, 2021, the virus infected 12,490,362 people and caused 310,550 deaths, with an incidence rate of 5,943.6 per 100,000 inhabitants, and a mortality rate of 147.8 per 100,000 inhabitants 4 .
In the municipality of São Paulo (MSP), as of March 26, 2021, the health report system (e-SUS Notifica) had recorded 2,390,256 cases of acute respiratory infection (ARI), of which 609,380 (25.5%) were confirmed for SARS-CoV-2 infection. In the Epidemiological Surveillance Information System of ARI (SIVEP Gripe), 161,758 cases of severe acute respiratory syndrome (SARS) in the city were reported, of which 90,049 (55.7%) were confirmed for COVID-19. Among these cases, 21,051 (3%) resulted in death.
Given the epidemiological situation of COVID-19 in the MSP, on March 23, 2020, the city council adopted strategies to reduce disease transmission, establishing voluntary quarantine with the closure of non-essential businesses 5 . On May 29, 2020, Decree No. 59,473 gradually reopened some non-essential services 6 ; therefore, it became extremely important to know the serological situation of the population regarding COVID-19 infection to support decision-making.
The researchers designed this serial serological survey to represent people aged 18 years or older in the MSP and to provide population-based data to direct strategies to combat the pandemic and evaluate the effects of actions to prevent and control COVID-19. Accordingly, our study aimed to estimate the prevalence of SARS-CoV-2 virus infection among adults living in the city and the proportion of asymptomatic individuals with positive tests, and to describe the evolution of the infection prevalence.

METHODS
This is a serial serological survey to estimate the prevalence of SARS-CoV-2 infection in the city of São Paulo.

Study Area
The capital of the state of São Paulo has a Human Development Index (HDI) of 0.805 and an estimated population in 2020 of 11.9 million inhabitants, of which 9.2 million were 18 years or older. The city also has a large economic and social disparity, mainly in education level, income, and housing; the Gini Index in 2010, for instance, was 0.6453 7 .
For planning healthcare actions, the city was divided into six regions (North, Center, West, Southeast, East and South), 27 Technical Health Supervisions, 472 primary healthcare units (PHU), and their respective coverage areas (CA) (Figure 1). Figure 1 shows the social and economic disparities between and within the regions of the MSP.

Sample Methodology
In 2020, we conducted a pilot study in June and seven cross-sectional studies periodically in different samples, every 15 days, from June to September. In 2021, four other studies were carried out in January and February. In each phase, a stratified probabilistic sample was used, with simple random sampling within each stratum 8 .
For a prevalence ranging from 5% to 20%, we determined the sample size so that the estimates had coefficients of variation lower than 15% for the MSP, and lower than 30% for the regions, except for the Central and West regions, which have a smaller number of PHU. This process resulted in eight dwellings per stratum. The researchers increased the sample size to 12 dwellings to maintain the accuracy of the estimates affected by non-response (closed households or refusals to participate in the study). In the Central and West regions, the sample size per CA-PHU was also increased to 15 to compensate for the smaller number of PHU in the region.
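As a rough illustration of how a target coefficient of variation translates into a sample size, the sketch below uses the textbook relation for a proportion under simple random sampling, CV = sqrt((1 - p) / (n p)). It ignores stratification, design effect and non-response, so it does not reproduce the study's own calculation (which is not detailed in the text); the function names are hypothetical.

```python
import math


def cv_of_proportion(prevalence: float, n: int) -> float:
    """Coefficient of variation of an estimated proportion under simple random sampling."""
    return math.sqrt((1 - prevalence) / (n * prevalence))


def srs_sample_size_for_cv(prevalence: float, target_cv: float) -> int:
    """Smallest n whose CV does not exceed target_cv (simple random sampling assumed)."""
    if not 0 < prevalence < 1:
        raise ValueError("prevalence must be strictly between 0 and 1")
    if target_cv <= 0:
        raise ValueError("target_cv must be positive")
    return math.ceil((1 - prevalence) / (prevalence * target_cv ** 2))


if __name__ == "__main__":
    # The lowest assumed prevalence (5%) is the most demanding case.
    for p in (0.05, 0.20):
        n = srs_sample_size_for_cv(p, 0.15)
        print(f"p={p:.2f}: n={n}, CV={cv_of_proportion(p, n):.3f}")
```

Because rarer outcomes need larger samples for the same relative precision, the lower bound of the assumed prevalence range drives the requirement.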
Statistically, this is a single-stage sample (the dwelling is the sampling unit); operationally, however, the selection comprised two stages: in the second stage, one resident of the household was selected using the last birthday method. Material for laboratory analysis was collected from the selected person, who was then interviewed. If the selected property was closed or the resident was not at home at the time of the visit, the PHU's team would return twice. The addresses were drawn from a database composed of three different types of records, updated between the study phases: a) residential property taxpayers (IPTU) of 2020; b) water meters of the sanitation company (SABESP) of 2017; c) Family Health Strategy (FHS). The distribution of dwellings by record was unequal across the city. Some areas were not represented in one of the records, or the number of dwellings was insufficient for the sample. The dwellings in each stratum were selected from the database proportionally to the number of dwellings in each record.
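The proportional selection across the three address records can be sketched as follows. The text does not say how fractional shares were rounded, so the largest-remainder rule used here is an assumption, as are the function name and the dwelling counts in the example.

```python
def allocate_proportionally(counts: dict, total: int) -> dict:
    """Split `total` dwellings across records in proportion to `counts`.

    Rounding uses the largest-remainder rule (an assumption, not the study's
    documented procedure), so the allocation always sums to `total`.
    """
    grand_total = sum(counts.values())
    if grand_total <= 0:
        raise ValueError("at least one record must contain dwellings")
    quotas = {name: total * c / grand_total for name, c in counts.items()}
    allocation = {name: int(q) for name, q in quotas.items()}  # floor of each quota
    leftover = total - sum(allocation.values())
    # Give the remaining dwellings to the records with the largest fractional parts.
    by_remainder = sorted(counts, key=lambda name: quotas[name] - allocation[name], reverse=True)
    for name in by_remainder[:leftover]:
        allocation[name] += 1
    return allocation


if __name__ == "__main__":
    # Hypothetical number of dwellings per record in one stratum.
    stratum = {"IPTU": 600, "SABESP": 300, "FHS": 100}
    print(allocate_proportionally(stratum, 12))
```

A record with no dwellings in a stratum simply receives an allocation of zero, which matches the situation described above for areas missing from one of the records.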

Testing and Questionnaire
The researchers used the SARS-CoV-2 Antibody Test® (Wondfo Biotech, Guangzhou, China), which detects IgM/IgG antibodies against the virus, without discriminating the type of immunoglobulin. In Brazil, the test is distributed by the MoH under the name "One Step COVID-2019 Test®", and the legal manufacturer is Celer Biotecnologia S/A 9 .
The test is based on the principle of lateral flow immunochromatography for the detection of IgG/IgM antibodies against SARS-CoV-2 in human blood, serum or plasma 10 .
In this survey, we used venous blood samples to obtain the serum, since the validation study of the OneStep Wondfo Test indicated higher sensitivity with serum samples than with capillary blood obtained by finger prick. Pellanda et al. 11 estimated a sensitivity of 84.8% and a specificity of 99.0% by assessing the results of four validation studies.
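To show how a test's sensitivity and specificity bear on an observed prevalence, the sketch below applies the standard Rogan-Gladen adjustment. The text does not report that this correction was applied to the survey estimates; it is shown only as an illustration, and the function name is hypothetical.

```python
def rogan_gladen(apparent: float, sensitivity: float, specificity: float) -> float:
    """Adjust an apparent (test-positive) prevalence for imperfect test accuracy.

    Standard Rogan-Gladen estimator, clamped to the [0, 1] interval.
    """
    denominator = sensitivity + specificity - 1
    if denominator <= 0:
        raise ValueError("test must perform better than chance (sens + spec > 1)")
    adjusted = (apparent + specificity - 1) / denominator
    return min(1.0, max(0.0, adjusted))


if __name__ == "__main__":
    # Sensitivity and specificity as reported by Pellanda et al.; the apparent
    # prevalence of 9.7% is the phase 1 point estimate, used only as an example.
    print(round(rogan_gladen(0.097, 0.848, 0.990), 4))
```

With a sensitivity below 100% and a specificity close to 100%, the adjusted value lies above the apparent one.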
In the last phase, to increase the sensitivity and the specificity of virus detection, the samples were also processed using an immunoenzymatic test (anti-SARS-CoV-2 ELISA) from Euroimmun, which uses the S1 spike protein as an antigen for the detection of IgG antibodies against the virus in the serum 12,13 .
The official laboratory results were reported to the study participants. They answered a semi-structured questionnaire with questions about sex, age, schooling, ethnicity (self-reported), family income, household size, symptoms potentially related to COVID-19, healthcare service use, previous SARS-CoV-2 tests, and contact with suspected or confirmed cases of COVID-19. The interviewees were also asked about their work arrangement, social distancing measures adopted, facemask use, visits to non-essential places and public transportation use.

Data Analysis
Data were entered into a standardized electronic form in FormSUS/DATASUS, version 3.0 14 . Data processing and analysis were performed using the statistical packages R and Stata version 13. Indeterminate results were classified in the analysis as negative.
Data were analyzed considering five regions of the city: Central-West; East; North; Southeast; and South. Prevalence estimates were weighted according to the sampling design.
Prevalence estimates and their 95% confidence intervals (CI) were calculated according to the region and HDI of the city, sex, age, ethnicity, education, income and presence of symptoms, risk factors, recommended measures for prevention and control of the disease and social distancing. Moreover, we calculated the proportion of asymptomatic infections.
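A design-weighted prevalence with a confidence interval can be sketched with the textbook estimator for stratified simple random sampling. This is a simplified illustration in Python rather than the R/Stata analysis used in the study: it ignores the finite population correction, weighting for household size and non-response, uses a normal-approximation interval, and the function name and stratum figures are hypothetical.

```python
import math


def stratified_prevalence(strata, z: float = 1.96):
    """Estimate prevalence from (N_h, n_h, positives_h) tuples, one per stratum.

    N_h is the number of dwellings in the stratum, n_h the number sampled.
    Returns (estimate, lower, upper) using a normal-approximation interval.
    """
    population = sum(N_h for N_h, _, _ in strata)
    estimate = 0.0
    variance = 0.0
    for N_h, n_h, positives_h in strata:
        if n_h < 2:
            raise ValueError("each stratum needs at least two sampled units")
        weight = N_h / population          # stratum share of the population
        p_h = positives_h / n_h            # within-stratum prevalence
        estimate += weight * p_h
        variance += weight ** 2 * p_h * (1 - p_h) / (n_h - 1)
    half_width = z * math.sqrt(variance)
    return estimate, max(0.0, estimate - half_width), min(1.0, estimate + half_width)


if __name__ == "__main__":
    # Two hypothetical strata of unequal size with equal sample sizes.
    est, low, high = stratified_prevalence([(1000, 10, 1), (3000, 10, 3)])
    print(f"{est:.3f} ({low:.3f} to {high:.3f})")
```

In this example the unweighted pooled proportion would be 4/20 = 0.20, whereas the weighted estimate gives more influence to the larger stratum.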
For statistical analyses, four phases were randomly chosen from the 11 phases conducted in this study. The Rao-Scott chi-square test was used to compare the frequencies between the categories of each variable, considering a significance level of 5%.

Ethical Aspects
The study was approved by the Brazilian National Ethics Committee (CAAE 32947920.3.0000.0008). Blood samples were collected and the individuals were interviewed only after signing an informed consent form.

RESULTS
Out of 63,372 selected individuals aged 18 years or over, residing in the five regions, 23,397 (36.9% participation rate) were interviewed and had samples collected. Table 1 shows the prevalence estimates of antibodies against SARS-CoV-2 in the city and their respective 95%CI for each phase of the study. All values found were within the range of CI variation of the previous phases, except in the last phase, when the researchers used two laboratory tests to increase the sensitivity and the specificity of virus detection. Table 2 shows the prevalence estimates of SARS-CoV-2 infection in phases 1, 4, 7, and 11. The results of phase 11 with only the rapid test (11a) and with the addition of the ELISA test (11b) are presented separately.
The prevalence estimates of this study did not follow the evolution of the accumulated cases of SARS-CoV-2 infection in the MSP, except for the last phase, as shown in Figure 2A.
The design effect (deff) varied between the study phases, from 1.93 in phase 7 to 3.09 in the last phase; in phase 1, the deff was 2.71, and in phase 4, 2.95.
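To make the reported design effects easier to interpret, the sketch below converts a deff into an effective sample size and a standard-error inflation factor, using the usual definitions n_eff = n / deff and sqrt(deff). The per-phase sample sizes are not given in the text, so the n used in the example is hypothetical, as are the function names.

```python
import math


def effective_sample_size(n: float, deff: float) -> float:
    """Size of a simple random sample with the same precision as the complex sample."""
    if deff <= 0:
        raise ValueError("deff must be positive")
    return n / deff


def standard_error_inflation(deff: float) -> float:
    """Factor by which the standard error exceeds that of simple random sampling."""
    if deff <= 0:
        raise ValueError("deff must be positive")
    return math.sqrt(deff)


if __name__ == "__main__":
    hypothetical_n = 2000  # not a figure from the study
    for phase, deff in (("1", 2.71), ("4", 2.95), ("7", 1.93), ("11", 3.09)):
        n_eff = effective_sample_size(hypothetical_n, deff)
        print(f"phase {phase}: n_eff={n_eff:.0f}, SE x{standard_error_inflation(deff):.2f}")
```

A deff near 3 means that the confidence interval is roughly 1.7 to 1.8 times wider than it would be for a simple random sample of the same size.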
The coefficients of variation of prevalence estimates for the municipality were below 15% in all phases and below 30% in all regions, except for the Central-West region.
There was a significant difference between the regions in phases 4 and 7. The Central-West region had the lowest estimates, whereas the Southern region had the highest. Table 2 also presents the prevalence estimates of SARS-CoV-2 infection, according to demographic and clinical characteristics. Regarding age, the estimated prevalences varied between the phases of the study: in phase 1, the highest prevalence was found in the age group from 50 to 64 years; in phase 4, from 18 to 34 years; in phase 7, from 35 to 49 years, and in phase 11, from 18 to 34 years. The difference among age groups was significant only in phase 7.
In all phases, women had the highest prevalence, especially in phase 11, when the difference was statistically significant. The estimated prevalence of black and brown people was the highest in all phases. There was a significant difference in ethnicity in all phases.
The highest prevalence estimates were found in the groups with the lowest education levels. Table 2 shows significant differences between education levels in phases 1, 4, and 7.
Prevalence estimates were inversely related to family income, which was categorized as A/B (≥ R$ 8,641.00), C (R$ 2,005.00 to R$ 8,640.00) and D/E (≤ R$ 2,004.00). Moreover, we used HDI ranges to estimate regional differences: range A (0.84 to 0.95) had the lowest prevalence, whereas range C (0.62 to 0.73) had the highest estimates. The differences among HDI ranges and among family income categories were significant, except in phase 11 (Table 3).
Regarding the work situation of the interviewees, the study showed that those teleworking had lower prevalence when compared to the other categories, especially in phases 1 and 4 (3.9% and 4.4%, respectively). Likewise, those who did not use public transportation and adopted social distancing measures had the lowest prevalence estimates. The difference was significant in phases 1 and 11.
The difference between the categories for mask use ("always," "most of the time," and "sometimes") was significant in phase 11 (24.4%).
Interviews and sample collection could not be conducted in 39,169 (62.6%) households. Of these, the selected dwelling could not be identified at 11,362 (29.0%) addresses, and the property was closed at 6,489 (16.6%). In 10,443 (26.7%) dwellings, the selected individual refused to participate; in 6,436 (16.4%), they were absent at the time of the visit; in 4,404 (11.2%), other reasons were given; and in 35 (0.1%), the selected individual had been vaccinated. The non-response rate tended to increase throughout the study phases, from 53.8% in phase 1 to 67.9% in the last one.
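The percentages given for each reason are consistent with shares of the 39,169 non-responding households, which the short check below reproduces from the counts in the paragraph above. The dictionary keys and the function name are labels chosen for this sketch.

```python
# Counts of non-response by reason, as reported in the text.
NON_RESPONSE = {
    "dwelling not identified": 11_362,
    "property closed": 6_489,
    "refusal": 10_443,
    "absent at visit": 6_436,
    "other reasons": 4_404,
    "vaccinated": 35,
}


def shares_in_percent(counts: dict) -> dict:
    """Return each count as a percentage of the total, rounded to one decimal place."""
    total = sum(counts.values())
    return {reason: round(100 * n / total, 1) for reason, n in counts.items()}


if __name__ == "__main__":
    print(sum(NON_RESPONSE.values()))
    for reason, pct in shares_in_percent(NON_RESPONSE).items():
        print(f"{reason}: {pct}%")
```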

DISCUSSION
The results showed that most people in the city are still susceptible to the virus, considering the increasing number of infections during the pandemic. The prevalence of individuals with positive results was higher among black and brown people when compared with white people. Also, the prevalence was inversely associated with education level, income, and with the HDI of the primary healthcare unit coverage area (CA-PHU) of the selected dwelling. The lowest prevalences were associated with protection measures against the disease.
In 2020, several studies in Brazil and around the world aimed to estimate the prevalence of SARS-CoV-2 in their populations 15–21 . The results of different prevalence studies should be carefully compared, as they depend on the pandemic evolution and laboratory tests used 22 .
In our study, the estimated prevalence of antibodies against the virus in the city ranged from 9.7% (95%CI: 7.9-11.8) in phase 1 to 16% (95%CI: 13.1-19.3) in phase 10; in phase 11, when we used two laboratory tests, the estimate was 25% (95%CI: 21.7-28.7). In Brazil, Hallal et al. 20 conducted a large study involving two serological surveys in 133 "sentinel cities" in all Brazilian states. They used the same laboratory test as our study, with finger prick blood samples, and estimated prevalences between 0% and 25%. In the state of Espírito Santo, Gomes et al. 22 conducted a population-based serial cross-sectional study in 11 cities and found a prevalence of 2.1% using the Celer Technologies Inc test in finger prick blood samples. In the city of São Paulo, a study in six administrative districts found a low prevalence (4.7%) using the MAGLUMI 2019-nCoV, an in vitro chemiluminescence immunoassay test 21 .
In our study, the highest prevalence was initially found in the age group from 50 to 64 years, which is a vulnerable population, most affected at the beginning of the pandemic. In the following phases, the younger age groups, which make up the economically active population circulating around the city, had the highest prevalence. This fact may be associated with the economic recovery of São Paulo following the reduction of restrictions and the reopening of commercial businesses 6 . Similar to other studies, there was no difference between the prevalence of men and women 16,17,20 .
Disparities regarding ethnicity and level of education are consistent with the inequalities associated with demographic, socioeconomic and risk factors for the disease transmission in the same city, as shown by Tess et al. 21 and Rosenberg et al. 15 .
The prevalence of the virus was inversely associated with the individual's income and with the HDI of the CA-PHU: the lower the income and the HDI of the CA-PHU, the higher the prevalence of SARS-CoV-2 infection. In accordance with the studies by Bermudi et al. 23 and Menezes et al. 24 , vulnerable populations with lower income and poor housing conditions have a higher risk of virus transmission and more difficulty accessing health services for diagnosis and treatment.
The proportion of individuals who had not reported COVID-19 symptoms since the beginning of the pandemic among those who tested positive for SARS-CoV-2 virus varied across the study phases. The proportion of up to 45.1% was high when compared with other studies. The proportions of asymptomatic individuals aged 18 years or older reported in seroprevalence studies in England 18 and in a USA city (Chelsea) 25 were 32.2% and 24.7%, respectively. In line with our study, Tess et al. 21 found a high proportion of asymptomatic individuals living in six districts of the city of São Paulo (45.3%).
The risk associated with the work situation was lower among individuals working from home than among those working at the office; this result was consistent with other seroprevalence studies of SARS-CoV-2 26,27 .
The use of masks was a protective measure against the infection and was corroborated by studies carried out in the state of Maranhão 26 and in China 28 .
Our study has some limitations. The non-response rate was high: some addresses or dwellings selected from the database were not identified (29.0%), and some individuals refused to participate in the study (26.7%). In addition, venous blood samples were collected instead of finger prick samples to increase the sensitivity of the test, as shown by Hallal et al. 20 and Tess et al. 21 . The large number of teams (471), composed of primary healthcare employees responsible for data collection, may have led to differences in how individuals were approached for the interview and sample collection. The design effect (deff) observed for the prevalence in the municipality and in the regions was higher than expected for stratified samples, possibly due to the large number of strata in the study 8 . The high deff reduced the precision of the estimates; therefore, future studies should review the sample design. Individuals under 18 years of age were not included in the sample. The use of a second test in phase 11 enhanced sensitivity, and ELISA detected 53.8% more positive cases than the rapid test. This suggests that the prevalence may have been underestimated in phases 1 to 10.
Serological tests may present false-negative results in the first days of the infection; therefore, they have little diagnostic value for acute cases. The proportion of negative results among symptomatic individuals whose samples were collected up to the 14th day after symptom onset decreased throughout the study, from 18.7% in phase 1 to 10.1% in the last phase. The probability of selecting individuals during the first 14 days of infection decreased over the course of the study, and the recall bias of the date of symptom onset was 16.4%.
In conclusion, the prevalence variation of SARS-CoV-2 infection was lower than the variation of the cumulative incidence rate, except for the last phase of the study. The differences in prevalence estimates, according to protective measures against SARS-CoV-2 infection, reinforce the need to maintain social distancing, mask use and telework in all age groups and social classes.
Sequential phases will allow the monitoring of the pandemic evolution and will verify the effectiveness of the current recommended protection measures for the population. Further studies should consider vaccinated people and the influence of vaccination on the population.