Felkl Et Al, 2023. Ancestry Resolution of South Brazilians by Forensic 165 Ancestry-Informative SNPs Panel


Forensic Science International: Genetics 64 (2023) 102838


journal homepage: www.elsevier.com/locate/fsigen

Ancestry resolution of South Brazilians by forensic 165 ancestry-informative SNPs panel
Aline Brugnera Felkl a, c, *, Eduardo Avila a, b, c, André Zoratto Gastaldo a, c,
Catieli Gobetti Lindholz a, Márcio Dorn a, c, d, Clarice Sampaio Alho a, c
a Forensic Genetics Laboratory, School of Health and Life Sciences, Pontifical Catholic University of Rio Grande do Sul, Porto Alegre, RS, Brazil
b Technical Scientific Section, Federal Police Department in Rio Grande do Sul State, Porto Alegre, RS, Brazil
c National Institute of Science and Technology – Forensic Science, Porto Alegre, RS, Brazil
d Institute of Informatics, Federal University of Rio Grande do Sul, Porto Alegre, RS, Brazil

ARTICLE INFO

Keywords: Brazilian population; Population genetics; Biogeographic ancestry; Massively parallel sequencing; Precision ID Ancestry Panel

ABSTRACT

Forensic DNA phenotyping (FDP) includes biogeographic ancestry (BGA) inference and externally visible characteristics (EVCs) prediction directly from an evidential DNA sample as alternatives to provide valuable intelligence when conventional DNA profiling fails to achieve identification. In this context, the application of Massively Parallel Sequencing (MPS) methodologies, which enable simultaneous typing of multiple samples and hundreds of forensic markers, has been gradually implemented in forensic genetic casework. The Precision ID Ancestry Panel (Thermo Fisher Scientific, Waltham, USA) is a forensic multiplex assay consisting of 165 autosomal SNPs designed to provide biogeographic ancestry information. In this work, a sample of 250 individuals from Rio Grande do Sul (RS) State, southern Brazil, apportioned into four main population groups (African-, European-, Amerindian-, and Admixed-derived Gauchos), was evaluated with this panel to assess the feasibility of this approach in a highly heterogeneous population. Forensic descriptive parameters estimated for each population group revealed that this panel has enough polymorphic and informative SNPs to be used as a supplementary instrument in forensic individual identification and kinship testing regardless of ethnicity. No statistically significant deviation from Hardy-Weinberg equilibrium was observed after Bonferroni correction. However, seven loci pairs displayed linkage disequilibrium in pairwise LD testing (p < 3.70 × 10^-6). Interpopulation comparisons by FST analysis, MDS plot, and STRUCTURE analysis among the four RS population groups apart and along with 89 reference worldwide populations demonstrated that Admixed- and African-derived Gauchos present the highest levels of admixture and population stratification, whereas European- and Amerindian-derived Gauchos exhibit a more homogeneous genetic conformation.
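
The LD significance threshold quoted above is what a Bonferroni correction of the 0.05 significance level stated in the Methods yields when it is applied to every unordered pair of the 165 SNPs. The sketch below reproduces that arithmetic. It assumes one LD test per SNP pair and one HWE test per SNP; these test counts are inferred from the reported thresholds and are not spelled out in the text.

```python
from math import comb

ALPHA = 0.05   # nominal significance level stated in the Methods
N_SNPS = 165   # markers in the Precision ID Ancestry Panel


def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test significance threshold after Bonferroni correction."""
    return alpha / n_tests


# Assumption: one HWE test per SNP, counting all 165 loci.
hwe_threshold = bonferroni_threshold(ALPHA, N_SNPS)

# Assumption: one LD test per unordered pair of SNPs.
n_pairs = comb(N_SNPS, 2)
ld_threshold = bonferroni_threshold(ALPHA, n_pairs)

print(f"HWE: {N_SNPS} tests, threshold = {hwe_threshold:.2e}")
print(f"LD:  {n_pairs} tests, threshold = {ld_threshold:.2e}")
```

Under these assumptions the pair count is 13,530 and the two thresholds round to the values reported in the paper for the HWE and LD tests.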

1. Introduction

Forensic DNA phenotyping (FDP) includes biogeographic ancestry (BGA) inference and externally visible characteristics (EVCs) prediction directly from an evidential DNA sample as alternatives to provide valuable intelligence when conventional DNA profiling fails to achieve an identification [1]. FDP may reduce the pool of potential suspects and hence guide investigations to find previously unknown perpetrators, as well as helping identify missing persons or mass disaster victims [2]. The indirect method of evaluating physical appearance provided by BGA is based on a set of distinctive, particular features presented by some human groups, which are easily perceivable and recognized as characteristic of such groups. As an example, pigmentation traits are one of the most distinguishing of these physical appearance elements. Different aspects of phenotypic expression can, therefore, be correlated with the different levels of genetic structure observed in human populations, and as such have been widely explored and investigated through techniques based on ancestry-informative markers (AIMs), mainly autosomal single nucleotide polymorphisms (SNPs) [3]. AIMs present marked allele frequency divergences among populations from different geographic regions and are useful for determining an individual’s likely biogeographic ancestry or population of origin.
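
To make the idea of an ancestry-informative marker concrete, the sketch below scores a genotype profile against two reference populations. It is an illustration only and not the inference method used in this study: the population names, SNP identifiers, and allele frequencies are invented, and the likelihood assumes Hardy-Weinberg proportions and independent loci.

```python
from math import log

# Hypothetical reference frequencies of allele "A" at three SNPs.
# All names and numbers here are invented for illustration.
REF_FREQS = {
    "POP1": {"snp1": 0.95, "snp2": 0.10, "snp3": 0.80},
    "POP2": {"snp1": 0.05, "snp2": 0.85, "snp3": 0.30},
}


def genotype_probability(freq_a: float, n_copies_a: int) -> float:
    """Hardy-Weinberg probability of carrying 0, 1 or 2 copies of allele A."""
    p, q = freq_a, 1.0 - freq_a
    return {0: q * q, 1: 2.0 * p * q, 2: p * p}[n_copies_a]


def log_likelihood(profile: dict, freqs: dict) -> float:
    """Sum of per-SNP log genotype probabilities (loci treated as independent).

    Frequencies of exactly 0 or 1 would need special handling; none occur here.
    """
    return sum(log(genotype_probability(freqs[snp], n)) for snp, n in profile.items())


def delta(snp: str) -> float:
    """Absolute allele frequency difference between the two populations."""
    return abs(REF_FREQS["POP1"][snp] - REF_FREQS["POP2"][snp])


# Number of copies of allele "A" carried by a hypothetical individual.
profile = {"snp1": 2, "snp2": 0, "snp3": 1}
scores = {pop: log_likelihood(profile, freqs) for pop, freqs in REF_FREQS.items()}
best = max(scores, key=scores.get)
print(best, {pop: round(score, 3) for pop, score in scores.items()})
```

The larger the frequency difference at a SNP, the more that SNP shifts the likelihood toward one population, which is why markers with marked frequency divergence are the ones selected for ancestry panels.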

* Correspondence to: Laboratório de Genética Forense, Escola de Ciências da Saúde e da Vida, Pontifícia Universidade Católica do Rio Grande do Sul, Av. Ipiranga,
6681, Prédio 12C, Sala 233, Porto Alegre, RS 90619-900, Brazil.
E-mail address: [email protected] (A.B. Felkl).

https://doi.org/10.1016/j.fsigen.2023.102838
Received 24 August 2022; Received in revised form 15 January 2023; Accepted 22 January 2023
Available online 23 January 2023
1872-4973/© 2023 Elsevier B.V. All rights reserved.

Forensic DNA analysis is often confronted with highly degraded and contaminated samples, requirements for high precision and reproducibility, in addition to time and cost considerations. In this sense, the advent of Massively Parallel Sequencing (MPS) techniques – used for simultaneous typing of a large number of targeted markers, with high throughput and consequently reduced analysis time – had a hugely positive effect on forensic sciences [4,5]. Soon after, commercial SNP-panel-based kits for sequencing on high-throughput platforms were introduced to the forensic community. Precision ID Ancestry Panel (formerly HID Ion AmpliSeq™ Ancestry Panel) comprises a set of 165 autosomal ancestry-informative SNPs (AISNPs) previously selected by two laboratories [6,7] and made commercially available by Thermo Fisher Scientific (TFS; Waltham, MA, USA) for BGA inference. The average amplicon length is 120–130 bp, projected to successfully allow processing of highly degraded, low input, and other forensically challenging samples.

The Brazilian population is multicultural and multiethnic, with a complex demographic history characterized by intense and heterogeneous admixing processes that encompass three large continental groups – Native Americans (NAM), European (EUR) settlers, and enslaved Sub-Saharan Africans (AFR) [8,9]. The influx of European settlers at the end of the 15th century, mostly coming from the Iberian Peninsula, culminated in both asymmetric mating with Amerindian women and a drastic reduction of the native people due to diseases and conflicts. Soon after, a large contingent of Africans, mostly from Western African territory (Senegal, Gambia, and Guinea-Bissau), was forcibly brought to Brazil as slaves. In the following two centuries, Africans were brought from Angola and Congo; and in the 19th century, the predominant component was from Mozambique [10]. Finally, late migratory movements occurred in the 19th and 20th centuries, with the arrival of Europeans (predominantly Germans, Italians, Portuguese, and Spaniards) and Asian migrants (essentially from Japan and Middle East countries). These peoples met and mated among themselves in different ways, giving rise to a highly admixed multiethnic population [11].

Brazilian territorial occupation followed variable patterns of multidirectional introgression according to social and historical conditions, and these patterns vary significantly among geographical regions [12]. Heterogeneous processes of migratory flows led to marked divergences in regional ethnic composition, and distinct proportions of parental population (NAM, EUR, and AFR) contributions are noticeable in present-day geopolitical regions [13]. Rio Grande do Sul (RS) is the southernmost State of Brazil, with a current estimate of approximately 11 million inhabitants. The history of RS is peculiar since its effective colonization started only in the 18th century. At the time the first Europeans arrived, the region was inhabited by Native Americans identified basically with three major groups: (1) Guarani; (2) Kaingang; and (3) Pampean tribes [14]. The African contingent established in south Brazil seems to have come mostly from South and East African coasts (current Angola and Mozambique), as well as from the West-Central African region [15]. From the 19th century onwards, large inflows of Germans and Italians gradually transformed the RS profile, shaping its population with one of the highest European ethnic compositions of the country.

The present study characterizes the 165 SNPs included in the Precision ID Ancestry Panel (TFS; Waltham, MA, USA) in four main RS State (southern Brazil) population groups (also termed “Gauchos”). We analyzed forensic parameters and conducted population structure analyses among the four population groups apart and along with 89 reference worldwide populations, aiming to scrutinize genetic diversity, similarity levels, ancestry inference, and population stratification of investigated population groups.

2. Materials and methods

2.1. Ethical Statement

All samples analyzed in this study were obtained from voluntary donors following informed consent. This work is in accordance with ethical principles stated in World Medical Association’s Helsinki Declaration [16] and was approved by the National Research Ethics Committee of CEP/Conep system via Plataforma Brasil, under CAAE number 15620919.3.0000.5336.

2.2. Samples, DNA extraction, and quantification

Oral swabs were obtained from 250 unrelated voluntary donors in the metropolitan region of Porto Alegre, Rio Grande do Sul (RS) State, southern Brazil. The population sample comprises 130 women and 120 men, with ages ranging from 18 to 75 years. Subjects provided phenotypic, ethnic, and ancestry information in a self-evaluation form and agreed to the photographic registry. Based on self-declared data and hetero-attribution by multivariate phenotypic evaluation (including eye, skin and hair color, and hair and facial morphology), volunteers were apportioned into four categories: European-derived Gauchos (EURS, n = 92), African-derived Gauchos (AFRS, n = 62), Amerindian-derived Gauchos (AMRS, n = 22, obtained from direct descendants of Guarani and Kaingang population groups from RS), and Admixed-derived Gauchos (ADRS, n = 74, characterized by an admixture of two or three parental populations declared by family history and verified by phenotype evaluation).

Genomic DNA from buccal swabs was extracted with a standard phenol-chloroform-isoamyl alcohol protocol. Extracted DNA was quantified using Qubit™ 2.0 Fluorometer with Qubit™ dsDNA High Sensitivity (HS) Assay Kit (TFS; Waltham, MA, USA) according to the manufacturer’s recommendations.

2.3. Library preparation, quantification, and sequencing – Precision ID Ancestry Panel

Library preparation of 132 samples was performed using Ion AmpliSeq™ Library Kit 2.0 (TFS; Waltham, MA, USA) combined with HID-Ion AmpliSeq™ Ancestry Panel (TFS; Waltham, MA, USA). Genomic DNA targets were amplified in a final reaction volume of 20 μL containing 1 μL of template DNA (1 ng), 4 μL of 5x Ion AmpliSeq™ Hi-Fi Mix, 10 μL of 2x Ion AmpliSeq™ primer pool (Ancestry Panel), and 5 μL of nuclease-free water. PCR reaction was performed in a Veriti 96-well Thermal Cycler (TFS; Waltham, MA, USA), under the following conditions: enzyme activation at 99 °C for 2 min, 21 cycles at 99 °C for 15 s and at 60 °C for 4 min, and holding at 10 °C. PCR amplicons were partially digested with 2 μL FuPa reagent and incubated at 50 °C for 10 min, 55 °C for 10 min, 60 °C for 20 min, and held at 10 °C for up to 1 h. Adapter ligation was performed by adding to the 22 μL of digested amplicon: 4 μL of Switch Solution, 0.5 μL of Ion P1 Adapter, 0.5 μL of Barcode X (X was chosen from Ion Xpress™ Barcode Adapters 1–96 Kit or IonCode™ Barcode Adapters 1–384 Kit for different samples), 1 μL of nuclease-free water, 2 μL of DNA ligase and incubated at 22 °C for 30 min, 72 °C for 10 min, and held at 10 °C for up to 1 h. After barcode adapter ligation, libraries were purified with 45 μL of 1.5x Agencourt® AMPure® XP Reagent (Beckman Coulter, FL, USA) and washed two times using freshly prepared 70% ethanol (EtOH), according to manufacturer’s instructions.

To assess yield and subsequent normalization, diluted libraries (9 μL at 1:100 dilution) were quantified using a 7500 Real-Time PCR System (TFS; Waltham, MA, USA) with Ion Library TaqMan™ Quantitation Kit (TFS; Waltham, MA, USA). Then multiple libraries diluted to 20 pM were pooled in equivolume for template preparation.

A 25 μL sample of the pooled library was added to the amplification solution to originate template-positive Ion Sphere Particles (ISPs). Emulsion-based clonal amplification (emPCR) was performed on Ion OneTouch™ 2 Instrument (TFS; Waltham, MA, USA) with Ion PGM™ Hi-Q™ View OT2 Kit (TFS; Waltham, MA, USA). Template-positive ISPs were enriched on Ion OneTouch™ Enrichment System (TFS; Waltham, MA, USA). Both emPCR and enrichment were conducted following the manufacturer’s protocol (Revision A.0) [17].

Controls and sequencing primers were added to enriched, template-
positive ISPs. Sequencing was run on Ion Torrent™ PGM™ Instrument (TFS; Waltham, MA, USA) using an Ion PGM™ Hi-Q™ Sequencing Kit (TFS; Waltham, MA, USA) and an Ion 318™ Chip v2 (TFS; Waltham, MA, USA). A final volume of 30 μL was loaded per chip, according to the manufacturer’s instructions (Revision C.0) [18]. Two chips with approximately 65 samples each were used in distinct runs for complete sample set genotyping.

2.4. Library preparation, quantification, and sequencing – AmpliSeq™ Custom DNA Panel for Illumina®

Primers were designed by BaseSpace™ DesignStudio™ Sequencing Assay Designer Software (Illumina, CA, USA), using AmpliSeq DNA Hotspot and GRCh38.p2 as reference human genome, at high stringency, a maximum amplicon length of 375 bp, and 100% coverage for the same 165 target SNPs included in the HID-Ion AmpliSeq™ Ancestry Panel. Genomic DNA of 68 samples was diluted to 10 ng as standard input recommended by the manufacturer’s protocol (Document # 1000000036408 v08) [19]. Library preparation was performed using AmpliSeq™ Library PLUS for Illumina® and AmpliSeq™ Custom DNA Panel for Illumina®. Genomic DNA targets were also amplified in a final reaction volume of 20 μL, but with 6 μL of template DNA (10 ng), 4 μL of 5x AmpliSeq™ Hi-Fi Mix, and 10 μL of 2x AmpliSeq™ Custom DNA Panel. PCR reaction was also performed in a Veriti 96-well Thermal Cycler, but under the following parameters: enzyme activation at 99 °C for 2 min, 18 cycles at 99 °C for 15 s and at 60 °C for 8 min, and holding at 10 °C. Changes in time and number of cycles considered the 375 bp amplicon length. Amplicons were partially digested similarly to Precision ID Ancestry Panel’s library preparation. Ligation of indexes I7 and I5 to each sample was conducted using AmpliSeq™ CD Indexes Set A for Illumina®, by adding to the 22 μL of digested amplicon: 4 μL of Switch Solution, 2 μL of AmpliSeq CD Indexes, 2 μL of DNA Ligase, and incubated at 22 °C for 30 min, 68 °C for 5 min, 72 °C for 5 min, and held at 10 °C for up to 24 h. Libraries were purified with 30 μL AMPure® magnetic beads and washed twice with freshly prepared 70% EtOH. A second amplification step was prepared to guarantee a sufficient library quantity for sequencing on MiSeq® System, as follows: to each library well were added 45 μL of 1x Lib Amp Mix and 5 μL of 10x Lib Amp Primers and incubated at 98 °C for 2 min, then 7 cycles of 98 °C for 15 min and 64 °C for 1 min, and held at 10 °C. Subsequently, libraries were subjected to two purification steps using AMPure® magnetic beads and freshly prepared 70% EtOH.

Qubit™ 2.0 Fluorometer and Qubit™ dsDNA HS Assay Kit were used to quantify the libraries. Next, libraries were diluted to starting concentration (2 nM) and pooled with 10 μL of each, afterward denatured with 0.2 N NaOH and diluted to final loading concentration of 9 pM following manufacturer’s instructions (Document # 15039740 v10) [20]. Sequencing was performed using the MiSeq® Reagent Kit v2 (500-cycles) on a MiSeq® System instrument (Illumina, CA, USA).

2.5. Sequencing data analysis

2.5.1. Ion Torrent™ PGM™

Signal processing (DAT files), base calling, and unmapped and mapped BAM files generation (Homo sapiens hg19 as reference genome to perform alignment) were conducted using Torrent Suite™ Software v5.0 (TFS; Waltham, MA, USA). Coverage Analysis v5.0 and Torrent Variant Caller v5.0 plugins were used to calculate the number of mapped reads and perform variant calling, respectively. SNP genotypes were called under standard analysis settings by HID SNP Genotyper v4.3.2 plugin, which allows genotype filtering at specific locations, given in the hotspot file (here the 165 SNPs that compose Precision ID Ancestry Panel). Minimum coverage was set to six reads per base position, and heterozygote allelic calls followed a maximal 70/30 imbalance rate, considering previous studies where the occurrence of allelic imbalance was observed in some genetic markers for HID Ion AmpliSeq Precision kits [21,22]. All genotypes and base calls were manually checked by at least two independent reviewers.

2.5.2. MiSeq® System

BaseSpace™ Sequence Hub DNA Amplicon v2.0 App was used to analyze the AmpliSeq™ Custom DNA Panel. Per-sample reads (FASTQ files) were aligned with the BWA algorithm against the reference genome (Homo sapiens GRCh38). Variant calling was performed by Pisces Variant Caller at a Depth Filter level of 10 and annotated by Illumina Annotation Engine using RefSeq transcripts. A VCF file containing variants of interest was uploaded to the project for SNP genotype calling.

2.5.3. Low-pass full genome sequencing

A subset of samples comprising 50 individuals was subjected to full genome sequencing through an external service provider (Gencove Inc., NY, USA). Full sequencing was attained with 1x coverage on an Illumina NextSeq 2000 equipment (Illumina, CA, USA) following library preparation and workflow according to the company’s internal procedures, including sequencing protocols and data processing, as described by Wasik and collaborators [23]. Results were provided as data files with different formats and were extracted from provided VCF files using a custom Python script. In these files, genetic data is displayed as genotype posterior probabilities, since the bioinformatics pipeline adopted by Gencove includes an imputation step based on the model proposed by Li and Stephens [24], to predict variants located in low coverage regions or undetected during sequencing. A threshold value of 0.98 for the genotype probability was adopted to reduce errors, and the rate of genotype calls under the adopted threshold was less than 0.5% (evenly distributed among all 165 SNPs, with no preferential sites for unreliable calls). Genotype calls with reported posterior probability under 0.98 were assigned as missing data.

The conversion of exported SNP genotype data to downstream software formats was done by PGDSpider v.2.1.1.5 [25]. Allele frequencies of 165 SNPs and corresponding forensic statistical parameters, including observed heterozygosity (Ho), expected heterozygosity (He), polymorphism information content (PIC), match probability (MP), power of discrimination (PD), power of exclusion (PE), and typical paternity index (TPI) were calculated using STR Analysis for Forensics (STRAF) v.1.0.5 [26] online software (available at http://cmpg.unibe.ch/shiny/STRAF/). Random match probability (RMP) calculations were performed with validated, in-house Excel-based workbooks. Exact test of Hardy-Weinberg equilibrium (HWE) and linkage disequilibrium test (LD) were performed using Arlequin v3.5.2.2 [27]. HWE analysis was carried out with 1,000,000 Markov Chain Monte Carlo (MCMC) steps and 1,000,000 dememorization steps. Correction for multiple testing was done according to the method suggested by Bonferroni [28], by dividing the significance level of 0.05 by the number of tests.

2.5.4. Data merging and population analyses

For comprehensive analyses of genetic relationships among populations, our data were combined with genotypic profiles from 89 reference worldwide populations for the 165 AISNPs included in Precision ID Ancestry Panel (TFS; Waltham, MA, USA). Twenty-four populations were extracted from the 1000 Genomes (1 kG) Project [29] Phase III and merged with previously published Basques [30] and Chinese Uyghur and Hui [31]. Danes and Somalis’ [32] genotypic data were kindly provided by Professor Niels Morling and collaborators. Sixty worldwide populations [33] genotyped at Kidd Lab and kindly provided by Professor Kenneth K. Kidd and collaborators also compose the reference population set. Details of populations used in the present study and their abbreviations are listed in Supplementary Table S1.

A population differentiation test based on pairwise FST genetic distances and molecular variance analysis (AMOVA) among our four studied population groups and along with 89 worldwide populations were performed using Arlequin v3.5.2.2. Based on pairwise FST values,
the Multidimensional Scaling (MDS) technique was applied using IBM® SPSS® Statistics v25.0 [34]. Individual ancestry proportions were evaluated using STRUCTURE v.2.3.4 [35], with ten independent runs for each K value, ranging from K = 2 to K = 20. A burn-in of 100,000 steps followed by 100,000 MCMC repetitions was applied, and ‘admixture’ and ‘correlated allele frequencies’ models were considered [36]. Summation and graphical representation of STRUCTURE results were generated using Cluster Markov Packager Across K (CLUMPAK) online server (available at http://clumpak.tau.ac.il/) [37]. To identify the K value that captures the uppermost structure level, we used Structure Harvester v.0.6.94 [38], which implements the Evanno method [39].

3. Results and discussion

Three distinct sequencing procedures were adopted to generate genetic profiles of 165 AISNPs in 250 unrelated South Brazilian subjects. A further study evaluating the comparative sequencing performance of these methods is underway. The employed panel comprises 55 autosomal biallelic SNPs from the AIM set developed by the Kidd group [6,33] and 123 from Seldin’s AIM set [7] (13 markers are included in both panels; see Supplementary Table S2 for SNP details) and aims to provide biogeographic ancestry information to guide investigative processes. The commercial kit was designed to properly handle degraded DNA samples, with an average targeted amplicon size of less than 130 bp. Several populations have been investigated using this panel to infer genomic ancestry and population stratification, including Asians (Uyghur and Hui [31], Japanese [40], Chinese Tibetan-Burmese [41], Uyghur and Kazakh [42,43], and other Asian populations [44]), Europeans (Basques [30], Danes [32], and Greenlanders [45]), South Americans (Ecuadorians [46]), Middle Eastern populations (Turks and Iranians [47]) and Africans (Somalis [32]). In the study herein, samples obtained from individuals belonging to the three main ethnicities of Rio Grande do Sul (RS) State, southern Brazil, as well as subjects with multiethnic backgrounds, were first investigated to explore genetic relationships and structures within and among them. Subsequently, population stratification analysis and individual ancestry inference were conducted with respect to the reference population set.

3.1. Forensic parameters of 165 SNPs for Rio Grande do Sul (Brazil) main population groups

The detailed 165 AISNP genotypes of 250 Brazilian subjects from RS are listed in Supplementary Table S2. Observed allele frequencies and forensic parameter estimates of these SNPs, including Ho, He, PIC, MP, PD, PE, and TPI for individual population groups are presented in Supplementary Table S3-S6, as well as p-values for HWE tests for all loci.

Two loci (rs1800414 and rs671) are monomorphic in all four populations investigated. rs3811801 is monomorphic in AFRS, AMRS, and ADRS subsets. rs1871534, rs3916235, and rs7657799 are monomorphic only in Amerindian-derived individuals. Invariable loci rs1800414, rs671, and rs3811801 were also monomorphic for the same alleles in Basques [30], Danes, Somalis [32], Greenlanders [45], and Ecuadorians [46]. Further inquiries at the Ensembl Genome Browser (Release 99) showed that these three markers have the same fixed allele in all European, Native American, and African samples reported to date, while they are polymorphic in East Asian populations. Therefore, the lack of genetic variability in these loci should not be extrapolated to the Brazilian population as a whole, as the sampling of this study was conducted in a single Brazilian federative unit (out of 27), particularly the one presenting the lowest rate of Asian ethnic composition, as reported by Brazilian Institute of Geography and Statistics (IBGE) demographic census [48]. These loci are expected to be variable in samples from southeastern Brazil, for instance, given the historical presence of Asian immigrants in this particular region [49].

No statistically significant deviation from HWE was observed after Bonferroni correction (p > 3.03 × 10^-4) in any ethnic subset. However, seven loci pairs displayed linkage disequilibrium in pairwise LD testing, even after Bonferroni correction for multiple comparisons (p < 3.70 × 10^-6): three pairs in AFRS (two of them also genetically associated in ADRS), three in EURS, and an extra pair in Admixed-derived Gauchos (Table 1). Four out of seven pairs are located on the same chromosome, up to 3.5 cM apart from each other: rs1834619–rs1876482 (Chr. 2), rs260690–rs3827760 (Chr. 2), rs1426654–rs735480 (Chr. 15), and rs3916235–rs4891825 (Chr. 18). The latter pair also had LD statistical significance in Basques [30], Danes, and Somalis [32]. Overall, full recombination and independent inheritance are expected in loci with a genetic distance of over 50 cM [50]. Nevertheless, the aforementioned statistically associated SNPs are located at markedly shorter distances. Besides physical linkage between loci, such non-random associations can be caused by, among other reasons, gene flow among populations with dissimilar allele frequencies, population structure, and small sample sizes [51]. Brazilian populations display varying levels of stratification and complex admixing patterns [52]; therefore, a conjunction of the foregoing factors is presumably inducing the genetic associations observed between seven loci pairs in three RS population groups. LD test p-values are detailed in Supplementary Table S8-S11. AMRS population presented no significant association among loci after Bonferroni correction.

Table 1
Genetically associated SNP pairs in RS State (Brazil) main population groups. Four out of seven pairs are located on the same chromosome (position based on hg19 genome). P-values for linkage disequilibrium tests are also provided.

Population  Locus #1   Locus #1 location  Locus #2   Locus #2 location  P-value LD
AFRS        rs1572018  Chr13: 41715282    rs2166634  Chr10: 118436068   1.76 × 10^-6
AFRS        rs1834619  Chr2: 17901485     rs1876482  Chr2: 17362568     7.07 × 10^-9
AFRS        rs3916235  Chr18: 67578931    rs4891825  Chr18: 67867663    1.74 × 10^-9
EURS        rs1407434  Chr1: 186149032    rs3827760  Chr2: 109513601    1.03 × 10^-6
EURS        rs260690   Chr2: 109579738    rs3827760  Chr2: 109513601    1.11 × 10^-11
EURS        rs4471745  Chr17: 53568884    rs731257   Chr7: 12669251     2.01 × 10^-6
ADRS        rs1426654  Chr15: 48426484    rs735480   Chr15: 45152371    6.22 × 10^-7
ADRS        rs1834619  Chr2: 17901485     rs1876482  Chr2: 17362568     4.22 × 10^-8
ADRS        rs3916235  Chr18: 67578931    rs4891825  Chr18: 67867663    1.66 × 10^-11

AFRS = African-derived Gauchos; EURS = European-derived Gauchos; ADRS = Admixed-derived Gauchos.

Observed heterozygosity (Ho) ranges from 0.048 (rs1229984 and rs4471745) to 0.661 (rs1040045) in AFRS, from 0.011 (rs3811801) to 0.576 (rs3784230) in EURS, from 0.045 (rs7251928 and rs7722456) to 0.727 (rs7745461 and rs948028) in AMRS, and from 0.095 (rs1229984) to 0.662 (rs1871428) in ADRS, with average values of 0.361 ± 0.133, 0.274 ± 0.144, 0.355 ± 0.146, and 0.388 ± 0.116, respectively. As expected, the Admixed-derived group (ADRS) has the highest intrapopulational genetic diversity average, followed by the African-derived one. Notably, heterozygosity values indicate greater miscegenation among the South Brazilian Amerindians compared to the European-derived population group. These results reflect the admixed landscape characterizing the Brazilian population and corroborate previous findings regarding genetic variability in the RS population and other Brazilian regions [53–56]. The SNP with the highest discrimination power was rs3916235 (PD = 0.658; MP = 0.342) in AFRS, rs459920 (PD = 0.661; MP = 0.339) in EURS, rs37369 (PD = 0.6653; MP = 0.3347) in AMRS, and rs7554936 (PD = 0.6622; MP = 0.3378) in ADRS. Combined match probability (CMP) was, in the same order as groups above, 2.45 × 10^-51, 8.62 × 10^-40, 1.20 × 10^-48, and 8.82 × 10^-56. In African-derived
Gauchos, the SNP with the highest power of exclusion was rs1040045 (PE = 0.3710), while in EURS it was rs3784230 (PE = 0.2632). In AMRS population, SNPs with the highest PE were rs948028 and rs7745461, both with a PE value of 0.4717. Combined power of exclusion (CPE) of 165 SNPs included in Precision ID Ancestry Panel was, for AFRS, EURS, AMRS, and ADRS: 99.99999960%, 99.99954437%, 99.99999967%, and 99.99999995%, respectively. CMP and CPE metrics could be regarded as indicators to evaluate the efficiency of genetic markers in forensic individualization. Forensic descriptive parameters of Precision ID Ancestry Panel (TFS; Waltham, MA, USA) estimated for each population group revealed that, although its primary purpose is biogeographic ancestry inference (whereas for an identification tool it is more suitable to use other panels, for instance, the Precision ID Identity Panel [57]), this panel has enough polymorphic and informative SNPs to be used as a supplementary instrument for individual identification in the forensic analytical repertoire.

Moreover, average random match probability (RMP) based on individual genotypic frequencies for all 165 SNPs was calculated for AFRS, EURS, AMRS, and ADRS populations and for RS State as a whole (RSBR). For the latter, an adjusted allele frequencies table was generated considering the relative contribution of each aforementioned group in RS population formation, according to IBGE demographic census [58] (Supplementary Table S7). Results are presented in Table 2. A rather significant overlap between ADRS RMPs in the three main ethnic populations (EURSPop., AFRSPop., and AMRSPop.) can be observed, corroborating a trihybrid composition to the admixed nature of this population sample. On the other hand, the average probabilities of AFRS, EURS, and AMRS genetic profiles to occur in populations other than their own (and ADRSPop., for AFRS and EURS profiles) are at least 25 orders of magnitude lower. Furthermore, EURS and ADRS are the most frequent genetic profiles found in RSBR population. Wright’s F-statistics (discussed later) shed light on the above outcomes regarding forensic aspects of the four RS population groups.

3.2. Interpopulation genetics analyses

Based on Precision ID Ancestry Panel (TFS; Waltham, MA, USA), pairwise FST for RS main population groups ranged from 0.07191 (AFRS and ADRS) to 0.38631 (EURS and AMRS). Table 3 presents results obtained with pairwise FST test in investigated population groups. Overall,

Table 3
Pairwise FST test for RS State (Brazil) main population groups based on 165 SNPs of Precision ID Ancestry Panel (TFS; Waltham, MA, USA). FST values are presented in lower-left diagonal, while upper-right diagonal exhibits the significance matrix (p = 0.00000).

Population  AFRS     EURS     AMRS     ADRS
AFRS        -        +        +        +
EURS        0.26051  -        +        +
AMRS        0.30261  0.38631  -        +
ADRS        0.07191  0.07702  0.23357  -

AFRS = African-derived Gauchos; EURS = European-derived Gauchos; AMRS = Amerindian-derived Gauchos; ADRS = Admixed-derived Gauchos. Significant values were represented by “+” signal.

Furthermore, the trihybrid multiethnic group displays tighter (and quite similar) genetic relationships with African-derived and European-derived population groups, and more distant (albeit closer than others) with the Amerindian one. These findings contrast with the results of a previous study concerning color and genomic ancestry in Brazilians [59], in which no statistically significant degree of genetic differentiation was observed among individuals classified as Whites, Intermediates, and Blacks from São Paulo city, southeastern Brazil, by typing of 12 STR loci. The use of forensic STRs for delineating population structure may explain disparities found, as markers with relatively lower mutation rates (SNP, Alu, Indel) are more suitable to provide biogeographic resolution at continental level [60].

Furthermore, pairwise FST values were calculated based on Precision ID Ancestry Panel (TFS; Waltham, MA, USA) SNPs among RS main population groups and 89 reference worldwide populations (see Supplementary Table S1 for details). Results are displayed as a heatmap in Supplementary Fig. S1 and pairwise FST values are detailed in Supplementary Table S12. African-derived Gauchos showed higher similarity levels with African Americans (AFRS–ASW: FST = 0.0162; AFRS–AAM: FST = 0.0177), followed by Eastern African Somalis (AFRS–SOM: FST = 0.0372) and Ethiopian Jews (AFRS–ETJ: FST = 0.0434), and more conspicuous divergence with Native Americans Suruí and Karitiana from Amazon region (AFRS–SUR: FST = 0.4599; AFRS–KAR: FST = 0.4366). European-derived Gauchos, on the other hand, presented more genetic proximity with Central and Southern Europe populations (EURS–HGR: FST = 0.0017; EURS–GRK: FST = 0.0049; and EURS–TSI: FST = 0.0055)
Amerindian population was found to be the most genetically distinct and succeeded by European Americans (EURS–EAM: FST = 0.0058), and
structured, with consistently higher observed pairwise FST values, fol­ highest differentiation levels with Native American Suruí (EURS–SUR:
lowed by European, African, and Admixed ethnicities, respectively. FST = 0.5325) and Biaka, pygmies from Central Africa (EURS–BIA: FST =
Considering the 165 ancestry-informative markers evaluated, there is a 0.5309). Brazilians with Amerindian ethnicity from RS State displayed
remarkable genetic differentiation level among population groups more prominent genetic similarity with Peruvians (AMRS–PEL: FST =
derived from the three main parental populations that bolstered the 0.0201), Maya (AMRS–MAY: FST = 0.0238), and Quechua (AMRS–QUE:
peopling of Brazil (Europeans, Africans, and Native Americans). FST = 0.0269), followed by North American Plains Amerindians
(AMRS–NPA: FST = 0.0494), corroborating the admixed nature of AMRS
Table 2
population group. Higher divergence levels were with Biaka pygmies
Average random match probability (RMP) of genetic profiles from each RS State and Western Africans (AMRS–BIA: FST = 0.5991; AMRS–ESN: FST =
population group in each ethnic population and in RS population as whole 0.5660; AMRS–YOR: FST = 0.5622). Admixed-derived Gauchos, char­
(RSBR; adjusted allele frequencies), based on allele frequencies of the 165 SNPs acterized by miscegenation among two or three of Brazilian main ethnic
included in Precision ID Ancestry Panel (TFS; Waltham, MA, USA). roots (European, African, and Amerindian), revealed higher similarity
AFRSProf. EURSProf. AMRSProf. ADRSProf. with Puerto Ricans and Colombians (ADRS–PUR: FST = 0.0128;
ADRS–CLM: FST = 0.0267), and more evident population differentiation
AFRSPop. 3.62E-50 ± 4.18E-82 ± 1.16E-88 ± 8.47E-60 ±
2.03E-49 2.88E-81 9.05E-88 6.42E-59
levels with Native Americans from Amazon region (ADRS–SUR: FST =
EURSPop. 7.89E-75 ± 7.67E-41 ± 1.37E-95 ± 7.67E-54 ± 0.4005; ADRS–KAR: FST = 0.3744).
7.48E-74 5.40E-40 1.30E-94 7.15E-53 To further investigate the above results regarding interpopulation
AMRSPop. 1.63E-88 ± 7.51E-84 ± 2.36E-48 ± 4.99E-70 ± genetic relationships of RS State main ethnicities and 89 worldwide
7.39E-88 3.44E-83 1.07E-47 2.29E-69
populations, an MDS plot was drawn based on pairwise FST values
ADRSPop. 2.15E-56 ± 1.56E-48 ± 3.35E-87 ± 3.14E-55 ±
1.50E-55 1.34E-47 2.85E-86 2.60E-54 (Fig. 1). MDS graph exhibits positive values in Dimension 1 as a char­
RSBRPop. 1.27E-72 ± 6.23E-44 ± 8.65E-78 ± 3.05E-49 ± acteristic feature for African (AFR) populations. Sub-Saharan African
7.44E-72 3.07E-43 3.97E-77 2.61E-48 populations are closely clustered at bottom-right edge of the quadrant,
AFRS = African-derived Gauchos; EURS = European-derived Gauchos; ADRS = while admixed AFR populations have broader dispersion along the axis.
Admixed-derived Gauchos. AFRS population is relatively close to admixed East African populations
Prof.
= Profile; Pop. = Population. (SOM and ETJ) and African Americans (AAM and ASW) in Dimension 1


Fig. 1. Genetic distances evaluation among RS State (Brazil) main population groups and 89 worldwide populations, presented as an MDS plot based on pairwise FST
values for 165 SNPs included in Precision ID Ancestry Panel (TFS; Waltham, MA, USA). Genetic distances between all pairs of populations were included, and multi-
dimension scaling procedure was applied to reduce dimensionality, from an n-dimensional space to a Cartesian space. Spatial proximity in the plot indicates genetic
similarity between populations, while distant populations tend to be located apart from each other.
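
The dimensionality reduction described in the caption can be sketched with classical (Torgerson) scaling applied to the four-group FST matrix of Table 3. This is only an illustration under stated assumptions: the paper does not specify this algorithm, the FST values are used directly as dissimilarities, and only the four RS groups are included rather than all 93 populations. The function names and the power-iteration solver are choices made here for a dependency-free example.

```python
import math

LABELS = ["AFRS", "EURS", "AMRS", "ADRS"]
# Pairwise FST values copied from Table 3 (symmetric, zero diagonal).
FST = [
    [0.0,     0.26051, 0.30261, 0.07191],
    [0.26051, 0.0,     0.38631, 0.07702],
    [0.30261, 0.38631, 0.0,     0.23357],
    [0.07191, 0.07702, 0.23357, 0.0],
]


def _orthonormalise(vec, basis):
    """Remove the components of vec along each basis vector, then scale to unit length."""
    for u in basis:
        dot = sum(a * b for a, b in zip(vec, u))
        vec = [a - dot * b for a, b in zip(vec, u)]
    norm = math.sqrt(sum(a * a for a in vec))
    return [a / norm for a in vec]


def classical_mds(dist, dims=2, iters=500):
    """Classical MDS: one list of `dims` coordinates per row of `dist`."""
    n = len(dist)
    sq = [[d * d for d in row] for row in dist]
    row_mean = [sum(r) / n for r in sq]
    grand = sum(row_mean) / n
    # Double-centred Gram matrix B = -1/2 * J * D^2 * J.
    gram = [[-0.5 * (sq[i][j] - row_mean[i] - row_mean[j] + grand)
             for j in range(n)] for i in range(n)]
    # Gershgorin-style shift so power iteration converges to the
    # algebraically largest eigenvalue even if B is not positive semidefinite.
    shift = max(sum(abs(x) for x in row) for row in gram)
    basis = [[1.0 / math.sqrt(n)] * n]  # constant vector (eigenvalue 0) is excluded
    axes = []
    for k in range(dims):
        vec = _orthonormalise([math.cos(i * (k + 1) + 1.0) for i in range(n)], basis)
        for _ in range(iters):
            prod = [sum(gram[i][j] * vec[j] for j in range(n)) + shift * vec[i]
                    for i in range(n)]
            vec = _orthonormalise(prod, basis)
        # Rayleigh quotient gives the eigenvalue of the unshifted matrix.
        eig = sum(vec[i] * sum(gram[i][j] * vec[j] for j in range(n)) for i in range(n))
        # Negative eigenvalues (FST is not a true metric) are clamped to zero.
        axes.append([x * math.sqrt(max(eig, 0.0)) for x in vec])
        basis.append(vec)
    return [[axis[i] for axis in axes] for i in range(n)]


coords = classical_mds(FST)
for label, (x, y) in zip(LABELS, coords):
    print(f"{label}: dim1={x:+.3f} dim2={y:+.3f}")
```

The signs of the axes are arbitrary in any MDS solution, so only relative positions are meaningful: groups with small pairwise FST, such as AFRS and ADRS, should land closer together than strongly differentiated pairs such as EURS and AMRS.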

median positive values. The multiethnic Gaucho group (ADRS) is plotted between the European and African clusters, indicating the substantial presence of these two higher components. This wide-distribution phenomenon is also observed in other Latino populations (PUR, CLM, MLX, and PEL), probably reflecting their admixed nature. Southern and Northern European (EUR) populations are clustered in Dimension 1 negative values. Even with a close disposition, it is feasible to distinguish both EUR regions, besides observing a nearness of Southern EUR with some Middle Eastern populations. European-derived Gauchos subset, as well as European Americans (EAM), are located in the EUR cluster. However, EAM is grouped along with Central and Northern EUR populations, while EURS adjoins Southern EUR ones. Asian populations are spread across both quadrants of Dimension 2, comprising negative values (Middle Eastern populations and Southern Asians), median-positive values (Central, Northern, and Eastern Asians), and higher positive values (Native American (NAM) populations). Amerindian-derived Gaucho group is contiguous to NA cluster, close to MAY, NPA, and QUE populations.

Table 4 shows AMOVA testing results for two datasets: (1) four main population groups from RS State (AFRS, EURS, AMRS, and ADRS) as independent populations; (2) the 93 populations assembled according to geographic location. Considering RS population groups only, among-populations covariance component accounts for an estimate of 18.7% of entire genetic differentiation, whereas 81.3% is due to individual-level divergence. AMOVA results for 93 worldwide populations revealed that 29.8% of genetic differentiation is justified by among-population variance, while 70.2% is due to variation at the individual level. Noteworthy, although there is a significant genetic structuring level among African, European, and Amerindian-derived Gaucho subpopulations, the most comprehensive source of variability is still the individual. Precision ID Ancestry Panel (TFS; Waltham, MA, USA) was designed to identify population genetic structures for ancestry inference purposes; however, it can also access genetic diversity at the individual level. These results support the convenience of using this panel as a supplementary instrument for individual identification in the forensic field.

Table 4
Fixation indexes and AMOVA results for RS State (Brazil) main population groups solely (dataset #1) and along with 89 worldwide populations (dataset #2), based on individual genotypes of Precision ID Ancestry Panel (TFS; Waltham, MA, USA). Statistically significant fixation indexes are highlighted in bold (p ≅ 0.00000).

Dataset                         Source of Variation                     Relative Variation (%)   Fixation Indexes
#1 Four main population         Among populations                       18.66                    FST = 0.18662
   groups from RS State         Among individuals within populations    0.61                     FIS = 0.00756
   (Brazil)                     Within individuals                      80.72                    FIT = 0.19277
#2 93 worldwide populations     Among groups                            26.85                    FCT = 0.26848
                                Among populations within groups         2.94                     FSC = 0.04016
                                Among populations                       29.79                    FST = 0.29792
                                Among individuals within populations    0.68                     FIS = 0.00968
                                Within individuals                      69.53                    FIT = 0.30466

3.3. Ancestry inference

To further characterize the genetic structure of RS State main population groups, ancestry resolution and admixture patterns estimates at populational and individual levels were assessed using Bayesian


inference methods, based on 5000 individuals from 93 worldwide populations. Fig. 2 presents populational bar charts of estimated cluster membership values from STRUCTURE runs for Brazilian samples alongside 89 reference populations. Estimates are based on individual genotypes for all 165 ancestry-informative SNPs composing Precision ID Ancestry Panel (TFS; Waltham, MA, USA). The optimal number of clusters according to Evanno method is K = 3, although higher K values successfully partitioned the populations into further continental (or even more geographically refined) divisions. When considering runs ranging from K = 5–20, Structure Harvester results indicate K = 7 as optimal K number (Supplementary Fig. S2). At K = 2 (data not shown) African and non-African ancestry components could be identified. At K = 3, African (blue), European (green), and Native American/Asian (red) ancestry components are discernible.

At optimal K number of 3, AFRS presents clustering patterns similar to adjoining African-American subpopulations (ACB, AAM, and ASW), although the green (EUR) and red (NAM/Asian) components are more pronounced, suggesting a higher admixing level among the parental populations that originated African-derived Gauchos than in African Americans. Within the AFRS, ancestry proportions range from AFR: 33.7–86.1%, EUR: 1.5–51.3%, and NAM: 0–52.4%. EURS displays a very similar clustering pattern to that of the North American counterpart (EAM) and European populations, with an almost total predominance of European component. Besides, a low NAM/Asian ancestry is also noticeable, and almost none AFR component is perceived. Indeed, within EURS, ancestry proportions vary from AFR: 0–1.4%, EUR: 65.9–99.5%, and NAM: 0–33.2%. AMRS, as the Peruvians (PEL), exhibits an expressive NAM/Asian component. Individually, AMRS ancestry proportions range from 0% to 9.8% (AFR), 0–44.2% (EUR), and 55–99.4% (NAM). ADRS presents a well-defined admixed pattern, with the three ancestry components clearly discernible. There is a prevalence of EUR composition, followed by AFR and NAM/Asian, respectively, corroborating results obtained with the MDS chart. At individual level, ancestry proportions vary from AFR: 0–61.8%, EUR: 19.0–89.9%, and NAM: 0–43.8%. Average ancestry estimates of RS State population samples (AFRS, EURS, AMRS, and ADRS) were inferred based on both optimal K values and are presented in Table 5. Results were extracted from runs with the largest Ln Probability Data [LnP(D)].

At K = 7, 8, and 9, the Central African, North African/Middle Eastern, Central and North Asia, and Pacific ancestry components

Fig. 2. Population structure of RS State (Brazil) main population groups along with 89 worldwide populations, based on 165 SNPs included in Precision ID Ancestry
Panel (TFS; Waltham, MA, USA). STRUCTURE plots are presented with cluster (K) number ranging from 3 to 9 (top to bottom; data for K = 6 and K = 8 not shown).
The optimal number of clusters was three. Each vertical line stands for an individual, with colors representing the relative proportion of association with each
inferred cluster. Populations referring to each number and respective geographic locations are listed in Supplementary Table S1.
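
How an optimal K is picked from replicate STRUCTURE runs can be illustrated with a small ΔK calculation in the spirit of the Evanno method [39]. This is a minimal sketch, not the authors' pipeline: the LnP(D) values below are invented for illustration, the dictionary and function names are hypothetical, and ΔK is computed here from the mean LnP(D) per K divided by the standard deviation across replicates, which is one common reading of the statistic.

```python
from statistics import mean, stdev

# Hypothetical LnP(D) values for three replicate runs per K (illustrative only,
# not taken from the study).
LNPD = {
    1: [-1000.0, -1002.0, -998.0],
    2: [-900.0, -902.0, -898.0],
    3: [-700.0, -701.0, -699.0],
    4: [-690.0, -695.0, -685.0],
    5: [-685.0, -695.0, -675.0],
}


def delta_k(lnpd):
    """Return {K: deltaK} for every K that has both neighbours K-1 and K+1."""
    avg = {k: mean(values) for k, values in lnpd.items()}
    result = {}
    for k in sorted(lnpd):
        if k - 1 in avg and k + 1 in avg:
            # Absolute second-order rate of change of the mean log-likelihood.
            second_diff = abs(avg[k + 1] - 2 * avg[k] + avg[k - 1])
            result[k] = second_diff / stdev(lnpd[k])
    return result


scores = delta_k(LNPD)
best_k = max(scores, key=scores.get)
print("deltaK per K:", scores)
print("K with the largest deltaK:", best_k)
```

Because ΔK needs both neighbours, it is undefined for the smallest and largest K tested, which is one reason a second search over a different K range (as in the K = 5–20 runs mentioned above) can point to a different optimum.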


become noticeable. Noteworthy, K = 7 distinguishes three Central African (“pygmy”) populations from the other Sub-Saharan Africa populations, which now display partial membership to two different clusters. Furthermore, there is a conspicuous transition from North Africa to Southwest Asia, then to Southern Europe, and finally to Northern Europe [61]. Accordingly, Southern Europeans present a "Mediterranean" component and partial assignment to the cluster that is essentially Northern European. The "Mediterranean" component becomes visible in EURS subpopulation, but not in the North American counterpart (EAM), reflecting the unique settlement process of each region.

Table 5
Ancestry estimates for RS State (Brazil) main population groups for three and seven clusters (K) using 89 worldwide populations as references.

K = 3
Ancestry:                   AFRS    EURS    AMRS    ADRS
African                     0.620   0.018   0.026   0.268
European                    0.240   0.946   0.089   0.570
Native American             0.140   0.036   0.885   0.162

K = 7
Ancestry:                   AFRS    EURS    AMRS    ADRS
W. African                  0.365   0.007   0.017   0.152
C. African                  0.255   0.007   0.015   0.110
C. N. European              0.113   0.688   0.066   0.288
SW. Asian/Mediterranean     0.102   0.249   0.057   0.252
S. Asian                    0.052   0.021   0.022   0.067
E. Asian                    0.023   0.011   0.015   0.023
Native American             0.091   0.017   0.809   0.107

4. Conclusion

In the present study, 250 individuals from RS State, Brazil, apportioned in four main Brazilian population groups were genotyped for 165 AISNPs included in Precision ID Ancestry Panel using massively parallel sequencing technology. Although the main purpose of the Precision ID Ancestry Panel is the biogeographical ancestry inference, forensic effectiveness analysis revealed that the panel could be applied as a supplementary approach in forensic individual identification and kinship testing regardless of ethnicity. However, the use of commercial solutions specifically designed as identification tool applicable to challenging samples, such as the Precision ID Identity Panel [57], is better suited for this purpose. Investigation of genetic differentiation among the four RS population groups shows evidence of a significant genetic structuring degree essentially among the three ethnicities directly derived from parental populations (AFRS, EURS, and AMRS). Therefore, BGA inference could be informative in the context of subpopulations with varying levels of genetic stratification and admixture patterns, although it should be used with caution and, preferably, associated with direct externally visible characteristics predictive markers. Population genetic similarities and divergences among the four RS population groups solely and along with 89 worldwide-distributed reference populations were also investigated. Findings from FST-based heatmap and MDS plotting demonstrated that Admixed-derived and African-derived Brazilians from RS present the highest levels of admixture and population stratification, being genetically more similar to other admixed populations (respectively, other Latin American multiethnic populations and African Americans, for instance), whereas European-derived and Amerindian-derived subpopulations exhibit a more homogeneous genetic conformation, similar to their respective parental populations.

Finally, the interethnic admixture landscape revealed by the model-based clustering of Structure suggested that AFRS has an essentially trihybrid heritage with larger African ancestry (62.0%) followed by European (24.0%), and a significant Amerindian component (14.0%); EURS has a predominant European ancestry (94.6%), as well as AMRS has a prevailing Amerindian one (88.5%); ADRS has a trihybrid genetic background composed mainly by European ancestry component (57%), followed by African (26.8%), and Amerindian (16.2%).

Declaration of Competing Interest

Authors declare they have no conflict of interest.

Acknowledgements

The authors would like to express their appreciation for the valuable contribution of the volunteers who participated in this study.

Financial support

The present work was funded by grants provided by the Coordenação de Aperfeiçoamento de Pessoal de Nível Superior – Brasil (CAPES) – Finance Code 001, Conselho Nacional de Pesquisa Científica – Brasil (CNPq) - Chamada nº 16/2014 - 465450/2014-8, and Fundação de Amparo à Pesquisa do Estado do Rio Grande do Sul – Brasil (FAPERGS – 17/2551-0000520-1).

Appendix A. Supporting information

Supplementary data associated with this article can be found in the online version at doi:10.1016/j.fsigen.2023.102838.

References

[1] M. Kayser, Forensic DNA phenotyping: predicting human appearance from crime scene material for investigative purposes, Forensic Sci. Int. Genet 18 (2015) 33–48.
[2] M. Kayser, P. De Knijff, Improving human forensics through advances in genetics, genomics and molecular biology, Nat. Rev. Genet 12 (3) (2011) 179–192.
[3] C. Phillips, Forensic genetic analysis of bio-geographical ancestry, Forensic Sci. Int. Genet 18 (2015) 49–65.
[4] C. Børsting, N. Morling, Next generation sequencing and its applications in forensic genetics, Forensic Sci. Int. Genet 18 (2015) 78–89.
[5] B. Bruijns, R. Tiggelaar, H. Gardeniers, Massively parallel sequencing techniques for forensics: a review, Electrophoresis 39 (21) (2018) 2642–2654.
[6] K.K. Kidd, W.C. Speed, A.J. Pakstis, et al., Progress toward an efficient panel of SNPs for ancestry inference, Forensic Sci. Int. Genet 10 (2014) 23–32.
[7] R. Kosoy, R. Nassir, C. Tian, et al., Ancestry informative marker sets for determining continental origin and admixture proportions in common populations in America, Hum. Mutat. 30 (1) (2009) 69–78.
[8] F.M. Salzano, M. Sans, Interethnic admixture and the evolution of Latin American populations, Genet Mol. Biol. 37 (2014) 151–170.
[9] R.R. Moura, A.V. Coelho, V. Balbino, S. Crovella, L.A. Brandão, Meta-analysis of Brazilian genetic admixture and comparison with other Latin America countries, Am. J. Hum. Biol. 27 (5) (2015) 674–680.
[10] P.D. Curtin, The Atlantic slave trade: A census, University of Wisconsin Press, Madison, WI, 1969.
[11] S.M. Callegari-Jacques, D. Grattapaglia, F.M. Salzano, et al., Historical genetics: spatiotemporal analysis of the formation of the Brazilian population, Am. J. Hum. Biol. 15 (6) (2003) 824–834.
[12] IBGE, Brasil: 500 anos de povoamento, Instituto Brasileiro de Geografia e Estatística, Rio de Janeiro, 2007.
[13] S.D. Pena, G. Di Pietro, M. Fuchshuber-Moraes, et al., The genomic ancestry of individuals from different geographical regions of Brazil is more uniform than expected, PLoS One 6 (2) (2011), e17063.
[14] A.R. Marrero, C. Bravi, S. Stuart, et al., Pre- and post-Columbian gene and cultural continuity: the case of the Gaucho from southern Brazil, Hum. Hered. 64 (3) (2007) 160–171.
[15] M.H. Gouveia, V. Borda, T.P. Leal, et al., Origins, admixture dynamics and homogenization of the African gene pool in the Americas, Mol. Biol. Evol. 1;37 (6) (2020) 1647–1656.
[16] World Medical Association, World Medical Association Declaration of Helsinki: ethical principles for medical research involving human subjects, JAMA 310 (20) (2013) 2191–2194.
[17] Thermo Fisher Scientific. Ion PGM™ Hi-Q™ OT2 Kit. Revision A.0 (2015). Waltham, MA, USA.
[18] Thermo Fisher Scientific. Ion PGM™ Hi-Q™ Sequencing Kit. Revision C.0 (2015). Waltham, MA, USA.
[19] Illumina. AmpliSeq for Illumina On-Demand, Custom, and Community Panels. Document # 1000000036408 v08 (2019). San Diego, CA, USA.
[20] Illumina. MiSeq System: Denature and Dilute Libraries Guide. Document # 15039740 v10 (2019). San Diego, CA, USA.
[21] M. Eduardoff, C. Santos, M. de la Puente, et al., Inter-laboratory evaluation of SNP-based forensic identification by massively parallel sequencing using the Ion PGM™, Forensic Sci. Int. Genet 17 (2015) 110–121.
[22] E. Avila, C.P. Cavalheiro, A.B. Felkl, et al., Brazilian forensic casework analysis through MPS applications: Statistical weight-of-evidence and biological nature of


criminal samples as an influence factor in quality metrics, Forensic Sci. Int. 303 (2019), 109938.
[23] K. Wasik, T. Berisa, J.K. Pickrell, et al., Comparing low-pass sequencing and genotyping for trait mapping in pharmacogenetics, BMC Genom. 20-22 (1) (2021) 197.
[24] N. Li, M. Stephens, Modeling linkage disequilibrium and identifying recombination hotspots using single-nucleotide polymorphism data, Genetics 165 (4) (2003) 2213–2233.
[25] H.E. Lischer, L. Excoffier, PGDSpider: an automated data conversion tool for connecting population genetics and genomics programs, Bioinformatics 28 (2) (2012) 298–299.
[26] A. Gouy, M. Zieger, STRAF - A convenient online tool for STR data evaluation in forensic genetics, Forensic Sci. Int. Genet 30 (2017) 148–151.
[27] L. Excoffier, H.E. Lischer, Arlequin suite ver 3.5: a new series of programs to perform population genetics analyses under Linux and Windows, Mol. Ecol. Resour. 10 (3) (2010) 564–567.
[28] C.E. Bonferroni, Teoria statistica delle classi e calcolo delle probabilità, Pubbl. Del. Reg. Ist. Super. di Sci. Econ. e Commer. di Firenze 8 (1936) 3–62.
[29] G.R. Abecasis, A. Auton, et al., 1000 Genomes Project Consortium, An integrated map of genetic variation from 1,092 human genomes, Nature 491 (7422) (2012) 56–65.
[30] O. García, J.A. Ajuriagerra, A. Alday, et al., Frequencies of the precision ID ancestry panel markers in Basques using the Ion Torrent PGM™ platform, Forensic Sci. Int. Genet 31 (2017) e1–e4.
[31] G. He, Z. Wang, M. Wang, et al., Forensic ancestry analysis in two Chinese minority populations using massively parallel sequencing of 165 ancestry-informative SNPs, Electrophoresis 39 (21) (2018) 2732–2742.
[32] V. Pereira, H.S. Mogensen, C. Børsting, N. Morling, Evaluation of the Precision ID Ancestry Panel for crime case work: a SNP typing assay developed for typing of 165 ancestral informative markers, Forensic Sci. Int. Genet 28 (2017) 138–145.
[33] A.J. Pakstis, W.C. Speed, U. Soundararajan, et al., Population relationships based on 170 ancestry SNPs from the combined Kidd and Seldin panels, Sci. Rep. 9 (2019) 18874.
[34] I.B.M. Corp. I.B.M. SPSS, 2017. Statistics for Windows. Version 25.0, Released 2017. Armonk, NY.
[35] J.K. Pritchard, M. Stephens, P. Donnelly, Inference of population structure using multilocus genotype data, Genetics 155 (2) (2000) 945–959.
[36] D. Falush, M. Stephens, J.K. Pritchard, Inference of population structure using multilocus genotype data: linked loci and correlated allele frequencies, Genetics 164 (4) (2003) 1567–1587.
[37] N.M. Kopelman, J. Mayzel, M. Jakobsson, N.A. Rosenberg, I. Mayrose, Clumpak: a program for identifying clustering modes and packaging population structure inferences across K, Mol. Ecol. Resour. 15 (5) (2015) 1179–1191.
[38] D.A. Earl, B.M. vonHoldt, Structure Harvester: a website and program for visualizing structure output and implementing the Evanno method, Conserv. Genet. Resour. 4 (2) (2011) 359–361.
[39] G. Evanno, S. Regnaut, J. Goudet, Detecting the number of clusters of individuals using the software structure: a simulation study, Mol. Ecol. 14 (8) (2005) 2611–2620.
[40] H. Nakanishi, V. Pereira, C. Børsting, et al., Analysis of mainland Japanese and Okinawan Japanese populations using the precision ID Ancestry Panel, Forensic Sci. Int. Genet 33 (2018) 106–109.
[41] Z. Wang, G. He, T. Luo, et al., Massively parallel sequencing of 165 ancestry informative SNPs in two Chinese Tibetan-Burmese minority ethnicities, Forensic Sci. Int Genet 34 (2018) 141–147.
[42] H. Simayijiang, C. Børsting, T. Tvedebrink, N. Morling, Analysis of Uyghur and Kazakh populations using the Precision ID Ancestry Panel, Forensic Sci. Int. Genet 43 (2019), 102144.
[43] T. Xie, C. Shen, C. Liu, et al., Ancestry inference and admixture component estimations of Chinese Kazak group based on 165 AIM-SNPs via NGS Platform, J. Hum. Genet (2020).
[44] J.H. Lee, S. Cho, M.Y. Kim, et al., Genetic resolution of applied biosystems™ precision ID Ancestry panel for seven Asian populations, Leg. Med. 34 (2018) 41–47.
[45] G. Espregueira Themudo, H. Smidt Mogensen, C. Børsting, N. Morling, Frequencies of HID-ion ampliseq ancestry panel markers among greenlanders, Forensic Sci. Int. Genet. 24 (2016) 60–64.
[46] R. Santangelo, F. González-Andrade, C. Børsting, A. Torroni, V. Pereira, N. Morling, Analysis of ancestry informative markers in three main ethnic groups from Ecuador supports a trihybrid origin of Ecuadorians, Forensic Sci. Int. Genet. 31 (2017) 29–33.
[47] D.M. Truelsen, M.S. Farzad, H.S. Mogensen, et al., Typing of two Middle Eastern populations with the Precision ID Ancestry Panel, Forensic Sci. Int. Genet 6 (2017) e301–e302.
[48] IBGE. Sistema IBGE de Recuperação Automática - SIDRA. Tabela 136 - População residente por cor ou raça (2010). Instituto Brasileiro de Geografia e Estatística.
[49] IBGE. Brasil: 500 anos de povoamento (2007). Rio de Janeiro: Instituto Brasileiro de Geografia e Estatística.
[50] C. Phillips, D. Ballard, P. Gill, D.S. Court, A. Carracedo, M.V. Lareu, The recombination landscape around forensic STRs: Accurate measurement of genetic distances between syntenic STR pairs using HapMap high density SNP data, Forensic Sci. Int Genet. 6 (3) (2012) 354–365.
[51] K.G. Ardlie, L. Kruglyak, M. Seielstad, Patterns of linkage disequilibrium in the human genome, Nat. Rev. Genet 3 (4) (2002) 299–309.
[52] F. Saloum de Neves Manta, R. Pereira, R. Vianna, et al., Revisiting the genetic ancestry of Brazilians using autosomal AIM-Indels, PLoS One 8 (9) (2013), e75145.
[53] F.C. Parra, R.C. Amado, J.R. Lambertucci, J. Rocha, C.M. Antunes, S.D. Pena, Color and genomic ancestry in Brazilians, Proc. Natl. Acad. Sci. USA 100 (1) (2003) 177–182.
[54] S.D. Pena, G. Di Pietro, M. Fuchshuber-Moraes, et al., The genomic ancestry of individuals from different geographical regions of Brazil is more uniform than expected, PLoS One 6 (2) (2011), e17063.
[55] Y.C. Muniz, L.B. Ferreira, C.T. Mendes-Junior, C.E. Wiezel, A.L. Simões, Genomic ancestry in urban Afro-Brazilians, Ann. Hum. Biol. 35 (1) (2008) 104–111.
[56] C.C. Gontijo, F.M. Mendes, C.A. Santos, et al., Ancestry analysis in rural Brazilian populations of African descent, Forensic Sci. Int. Genet 36 (2018) 160–166.
[57] E. Avila, A.B. Felkl, P. Graebin, C.P. Nunes, C.S. Alho, Forensic characterization of Brazilian regional populations through massive parallel sequencing of 124 SNPs included in HID ion Ampliseq Identity Panel, Forensic Sci. Int. Genet 40 (2019) 74–84.
[58] I.B.G.E., 2010. Sistema IBGE de Recuperação Automática - SIDRA. Tabela 136 - População residente por cor ou raça (2010). Instituto Brasileiro de Geografia e Estatística.
[59] J.R. Pimenta, L.W. Zuccherato, A.A. Debes, et al., Color and genomic ancestry in Brazilians: a study with forensic microsatellites, Hum. Hered. 62 (4) (2006) 190–195.
[60] A. Moriot, C. Santos, A. Freire-Aradas, C. Phillips, D. Hall, Inferring biogeographic ancestry with compound markers of slow and fast evolving polymorphisms, Eur. J. Hum. Genet 26 (11) (2018) 1697–1707.
[61] A.J. Pakstis, C. Gurkan, M. Dogan, et al., Genetic relationships of European, Mediterranean, and SW Asian populations using a panel of 55 AISNPs, Eur. J. Hum. Genet 27 (12) (2019) 1885–1893.
