Relationship between self-declared ethnicity, mitochondrial haplogroup and genomic ancestry in individuals from Southeastern Brazil

In populations with a high degree of admixture, as in Brazil, self-declared ethnicity alone is not a good method of ethnic classification. We evaluated the relationship between self-declared ethnicity, genomic ancestry and mitochondrial haplogroups in 492 individuals from Southeastern Brazil. Mitochondrial haplogroups were obtained by analyzing the hypervariable regions of mitochondrial DNA (mtDNA), and genomic ancestry was estimated using 48 autosomal ancestry informative markers (AIMs). Of the 492 individuals, 74.6% self-declared as white, 13.8% as brown and 10.4% as black. Regarding mtDNA haplogroups, 46.3% carried African mtDNA, and the major genomic ancestry was European (57.4%). When mtDNA and genomic ancestry were distributed according to self-declared ethnicity, 37.6% of the 367 individuals self-declared as white showed African mtDNA, and European ancestry made the largest genomic contribution in this group (63.3%). Of the 68 individuals self-declared as brown, 25% showed Amerindian mtDNA, and the average contributions of European and African ancestry differed little. Of the 51 individuals self-declared as black, 80.4% had African mtDNA, and African ancestry made the main genomic contribution (55.6%). The Brazilian population had a very uniform degree of Amerindian genomic ancestry, and only by using genetic markers (autosomal and mitochondrial) were we able to capture this information. Epidemiological studies should combine these methods to obtain complementary information.


INTRODUCTION
In the last few years, the literature has proposed various applications for ethnicity information, such as in forensic science, epidemiological studies, and clinical and pharmacological trials. However, in highly admixed populations, such as that of Brazil, this personal information cannot provide estimates as robust as in less diverse populations 1,2.
The Brazilian population is one of the most heterogeneous in the world. The genotypic and phenotypic characteristics of various groups have been added to those of our native population 3,4,5. Because of this high rate of admixture, physical traits such as skin and eye color, hair, and the shape of the lips and nose are not good indicators of the geographical origin of a Brazilian individual's ancestors 6.
Ancestry Informative Markers (AIMs) are autosomal markers that show differences in allele frequencies between two or more distinct populations and have therefore been used to estimate the genomic ancestry of a population or individual 7,8. These markers have a great advantage over physical features because they are constant throughout life 6,9.
Mitochondrial DNA (mtDNA) has also proved to be a good marker for inferring maternal ethnicity. Several studies have indicated the feasibility of inferring the probable geographic origin of an individual from the sequence of the hypervariable regions (HV) of the mitochondrial genome. These studies also make clear that, because mtDNA is inherited exclusively through the maternal line, the mitochondrial sequence alone does not determine one's ethnicity 1,10.

OBJECTIVE
The aim of this study was to evaluate the relationship between self-declared ethnicity, genomic ancestry and mitochondrial haplogroups in 492 individuals from Southeast Brazil.

Population Samples
We studied 492 individuals living in São Paulo (Southeastern Brazil). The volunteers answered a questionnaire that included a multiple-choice question on self-declared ethnicity, based on the method used in the national census survey of the Brazilian Institute of Geography and Statistics (IBGE), which classifies individuals as "Brancos" (i.e., white), "Pardos" (i.e., brown), "Pretos" (i.e., black), "Amarelos" (i.e., yellow) and "Indígenas" (i.e., indigenous). All individuals signed an informed consent form, and the research protocol was approved by the Ethics Committee of the Clinical Hospital of the Medical School, University of São Paulo.

Mitochondrial DNA Analysis
DNA was extracted from peripheral blood leukocytes following standard salting-out techniques 11.
The DNA was amplified in a single PCR reaction using primers designed with the Primer3 program. The primers were L15879 (5'-AATGGGCCTGTCCTTGTAGT-3') and H727 (5'-AGGGTGAACTCACTGGAACG-3'). The amplified segment corresponds to nucleotide positions 15879-727, a span of 1417 base pairs (bp) comprising the three hypervariable regions of interest (HV1, HV2 and HV3). Samples were sequenced in the forward and reverse directions. Capillary electrophoresis was performed on an ABI3130 sequencer, and results were analyzed using the BioEdit program.
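Because the mitochondrial genome is circular, the amplified segment runs from position 15879 past the end of the reference numbering and on to position 727. As a quick check on the reported length, the minimal sketch below assumes a reference length of 16,569 bp for the rCRS (a figure not stated in the text) and takes the coordinate difference around the circle; the function name is hypothetical.

```python
RCRS_LENGTH = 16569  # assumption: length of the rCRS reference, not given in the text


def circular_span(start: int, end: int, genome_length: int = RCRS_LENGTH) -> int:
    """Distance in bp from `start` to `end` on a circular genome, wrapping past the origin."""
    return (end - start) % genome_length


# Coordinates of the amplified segment as reported above
print(circular_span(15879, 727))
```

Under this convention the span matches the 1417 bp reported; counting both end positions inclusively would give one base more.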
Individual sequences were compared with the revised Cambridge Reference Sequence (rCRS) 12,13 using the ClustalW software. Differences from the rCRS found in each sequence were recorded following the nomenclature recommendations 14,15. mtDNA haplogroups were classified using the mtDNAmanager and Phylotree 16 programs.
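The comparison step can be illustrated with a minimal sketch. It is not the ClustalW or mtDNAmanager workflow used in the study; it only assumes that a sample has already been aligned to the reference (gaps written as "-") and lists the substitutions. The function name, the toy sequences and the "reference base, position, sample base" label format are all illustrative assumptions, and insertions and deletions are deliberately skipped.

```python
def list_substitutions(aligned_reference: str, aligned_sample: str,
                       first_position: int = 1) -> list[str]:
    """List substitutions of an aligned sample relative to an aligned reference.

    Positions follow the reference numbering, so gap columns in the reference
    (insertions in the sample) do not advance the position counter.
    """
    if len(aligned_reference) != len(aligned_sample):
        raise ValueError("aligned sequences must have equal length")

    substitutions = []
    position = first_position - 1
    for ref_base, sample_base in zip(aligned_reference.upper(), aligned_sample.upper()):
        if ref_base == "-":
            continue  # insertion relative to the reference: not reported in this sketch
        position += 1
        if sample_base == "-":
            continue  # deletion relative to the reference: not reported in this sketch
        if ref_base != sample_base:
            substitutions.append(f"{ref_base}{position}{sample_base}")
    return substitutions


# Toy sequences for illustration only; these are NOT rCRS data.
toy_reference = "ACGT-ACGT"
toy_sample = "ACTTAAC-T"
print(list_substitutions(toy_reference, toy_sample))
```

In practice the resulting list of differences, written in the recommended nomenclature, is what a haplogroup classification tool takes as input.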

Ancestry Informative Markers Analysis
Genomic ancestry was evaluated using forty-eight biallelic insertion-deletion (INDEL) AIMs on autosomal chromosomes: 16 markers of African ancestry, 16 of European ancestry and 16 of Native American ancestry. These markers have been successfully used to assess parental genetic contributions in different populations 17,18.
The distribution of self-declared ethnicities, mitochondrial haplogroups and genomic ancestry is presented in Table 1.

DISCUSSION
Of the 492 individuals who reported their ethnicity, 74.6% (n=367) self-declared as white, suggesting a social process of "whitening". According to the IBGE, 48.2% of individuals in Brazil self-declare as white, and 56.7% in the Southeast 19. Various genetic studies of different Brazilian sub-populations, including this one, have demonstrated discrepancies between self-declared information and the individual's genetic background 4,20. These facts point to problems in the methodology used to collect ethnicity information in the Brazilian population, which makes these data very difficult to use in epidemiological studies.
Regarding mitochondrial analysis, our results showed that 46.3% (n=228) of all individuals and 37.6% (n=138) of the individuals who self-reported as white carried African mtDNA (Table 1). Comparing mtDNA across self-declared ethnicities reveals some disparities: 37.6% (n=138) of individuals who self-declared as white had African mtDNA, while 25% (n=17) of individuals who self-declared as brown had Amerindian mtDNA (Table 1). This emphasizes that, in a population with a high degree of admixture and where superficial physical traits can vary with age and environmental factors, self-declared ethnicity alone is not a good method for ethnic classification 6,20.
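The percentages quoted above follow directly from the reported counts. The short sketch below recomputes them as a consistency check; the counts come from the text, while the helper name and the dictionary layout are illustrative assumptions.

```python
def percent(count: int, total: int) -> float:
    """Share of `count` in `total`, as a percentage rounded to one decimal place."""
    return round(100 * count / total, 1)


# (count, group size) pairs taken from the text
reported = {
    "self-declared white, whole sample": (367, 492),
    "African mtDNA, whole sample": (228, 492),
    "African mtDNA, self-declared white": (138, 367),
    "Amerindian mtDNA, self-declared brown": (17, 68),
}

for label, (count, total) in reported.items():
    print(f"{label}: {percent(count, total)}%")
```

Note that the denominator changes between rows: the first two shares are of the whole sample, the last two are of a single self-declared group.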
The genomic ancestry found in this study showed that European ancestry makes the major contribution among Brazilians (57.5% ±22.2%) (Table 1). Although these data corroborate previous studies regarding the major contribution to genomic ancestry 2,20, there were differences between our data and these studies: we found a slightly larger percentage of African genomic ancestry and a slightly lower percentage of European genomic ancestry in the black and brown individuals. These differences could be mainly due to the socio-economic bias that exists in the Brazilian population.
Interestingly, only by using genetic markers (autosomal or mitochondrial) could we capture information about the Amerindian component of the Brazilian population, even though only one individual self-declared as indigenous. Looking at the sample as a whole, we observed a 14.2% (±10%) genomic contribution of Amerindian ancestry, and 28.7% (n=141) of subjects had Amerindian mtDNA (Table 1).
Some studies have demonstrated that there is no strong correlation between self-declared ethnicity and genetic ancestry 6,8,20. In the present study, however, we suggest that combining self-declared ethnicity with markers of the different types of inheritance, uniparental and biparental, increases the information content. mtDNA provides a clear pattern of historical events that is not obscured by recombination 21, and AIMs can estimate individual and population interethnic admixture 7; together they allow an accurate reconstruction of the genetic ancestry of a population or individual. Nevertheless, self-declared ethnicity can bring different information to the researcher, since it results from visual traits such as skin color combined with socio-economic and cultural aspects, and is determined by factors that may not be captured by genetic markers of admixture or geographical ancestry 22.

CONCLUSION
In conclusion, this study revealed that the genomes of most Brazilians are admixed and that genetic markers can provide valuable new insights into the current structure of the Brazilian population. Regardless of skin color, a large proportion of individuals from Southeastern Brazil (46.3% overall) presented African matrilineages, and individuals showed a high degree of European genomic ancestry and a very uniform degree of Amerindian genomic ancestry. Given the overall low correlation between ethnic, genetic and matrilineal information, we propose that, especially in admixed populations, the use of a three-dimensional construct may provide more predictive power for studying the association between ethnicity and complex human phenotypes.

TABLE 1.
Distribution of mtDNA and genomic ancestry according to the self-declared ethnicities