Admixture Runs (SGDP Plus)


 

III. Evaluation Experiments

(1) ./make_nnls.R Tarrega Germany_Erfurt_ME Ireland_Kilteasheen_AngloSaxon_EMedieval_Norman Germany_Anderten_Saxon_Medieval (medieval samples as anchors)

2 populations: Tarrega 44.72, Norman 55.28. Misfit: 0.568461.

2 populations: Tarrega 36.94, Erfurt_ME 63.06. Misfit: 0.568468.

2 populations: Erfurt_ME 79.3, Saxon 20.7. Misfit: 0.568498.

2 populations: Tarrega 75.98, Saxon 24.02. Misfit: 0.568629.

4 populations: Saxon 6.18, Norman 39.53, Erfurt_ME 35.12, Tarrega 19.18. Misfit: 0.568284.

5 populations (+ French): Tarrega 0.51, Erfurt_ME 0.41, Norman 1.11, Saxon 0.0, French 97.97. Misfit: 0.003069.

5 populations (+ English): Tarrega 0.0, Erfurt_ME 0.43, Norman 0.81, Saxon 0.0, English 98.76. Misfit: 0.003305.

5 populations (+ Basque): Tarrega 0.0, Erfurt_ME 0.0, Norman 0.69, Saxon 0.83, Basque 98.49. Misfit: 0.004129.

5 populations (+ Hungarian): Tarrega 0.0, Erfurt_ME 0.0, Norman 0.0, Saxon 0.0, Hungarian 100.0. Misfit: 0.006451.

5 populations (+ Bergamo): Tarrega 0.14, Erfurt_ME 0.0, Norman 0.48, Saxon 0.0, Bergamo 99.38. Misfit: 0.062901.

5 populations (+ Sardinian): Tarrega 0.0, Erfurt_ME 0.0, Norman 2.4, Saxon 0.63, Sardinian 96.96. Misfit: 0.296521.

5 populations (+ Turkish): Tarrega 4.51, Erfurt_ME 8.41, Norman 10.06, Saxon 0.84, Turkish 76.18. Misfit: 0.549354.

5 populations (+ Yemenite_Jew): Tarrega 8.0, Erfurt_ME 14.66, Norman 16.5, Saxon 2.58, Yemenite_Jew 58.26. Misfit: 0.554035.

5 populations (+ Mixe): Tarrega 16.77, Erfurt_ME 30.7, Norman 34.56, Saxon 5.4, Mixe 12.57. Misfit: 0.568133.

5 populations (+ Han): Tarrega 19.17, Erfurt_ME 35.1, Norman 39.51, Saxon 6.18, Han 0.04. Misfit: 0.568284.
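The script make_nnls.R itself is not shown here, so the following is only a minimal sketch of the general technique: non-negative least squares (NNLS) fits a target vector as a combination of source vectors with weights that cannot go below zero. It assumes, hypothetically, that each population is represented by a vector of ADMIXTURE component fractions, that the weights are rescaled to percentages, and that "misfit" is the Euclidean norm of the residual. The function name, the toy populations and the solver (projected gradient descent rather than whatever the R script uses) are all illustrative.

```python
import math


def nnls_mixture(sources, target, iters=5000):
    """Fit target ~ sum_i w_i * sources[i] with all w_i >= 0.

    sources: dict mapping a population name to its component vector.
    target:  component vector of the sample to be modelled.
    Returns (percentages per population, residual norm).
    """
    names = list(sources)
    vecs = [sources[n] for n in names]
    # Gram matrix G[i][j] = <s_i, s_j> and right-hand side c[i] = <s_i, target>.
    gram = [[sum(a * b for a, b in zip(si, sj)) for sj in vecs] for si in vecs]
    rhs = [sum(a * b for a, b in zip(si, target)) for si in vecs]
    # The trace of G bounds its largest eigenvalue, so 1/trace is a safe step size.
    step = 1.0 / sum(gram[i][i] for i in range(len(vecs)))
    w = [0.0] * len(vecs)
    for _ in range(iters):
        grad = [sum(g * x for g, x in zip(row, w)) - r for row, r in zip(gram, rhs)]
        # Gradient step followed by projection onto the non-negative orthant.
        w = [max(0.0, x - step * g) for x, g in zip(w, grad)]
    fitted = [sum(wi * v[k] for wi, v in zip(w, vecs)) for k in range(len(target))]
    misfit = math.sqrt(sum((f - t) ** 2 for f, t in zip(fitted, target)))
    total = sum(w) or 1.0
    return {n: 100.0 * wi / total for n, wi in zip(names, w)}, misfit


# Toy data (hypothetical): the target is built as 25% PopA + 75% PopB.
toy_sources = {
    "PopA": [0.8, 0.2, 0.0],
    "PopB": [0.1, 0.9, 0.0],
    "PopC": [0.0, 0.1, 0.9],
}
toy_target = [0.275, 0.725, 0.0]
percentages, misfit = nnls_mixture(toy_sources, toy_target)
print(percentages, misfit)
```

A misfit close to zero means the sources can reproduce the target almost exactly, while a misfit that barely changes between source sets (as in the runs with medieval anchors only) suggests that none of the offered sources explains the target well.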

 

(2) ./make_nnls.R French English Yemenite_Jew X Y (modern populations as anchors)

Basque 49.7 ( ~ "Ibero-Phoenician"), Norwegian 26.39, Finnish 9.39, Saxon 0.08 ( = 36.01 "German"), English 8.47, Norman 0.89 ( = 9.36 "White North American"), Samaritan 1.43, Palestinian 0.76, Yemenite_Jew 1.08 ( = 3.27 "Levantine"), Polish 0.22 ( ~ "East European"), Mixe 0.91, Surui 0.22 ( = 1.13 "Mi'kmaq"), Onge 0.44. Misfit: 0.000316.

Basque 48.18, Norwegian 26.54, Mixe 0.92, Finnish 9.74, English 9.74, Samaritan 1.47, Palestinian 0.77, Yemenite_Jew 1.25, Saxon 0.06, Norman 0.9, Surui 0.22. Misfit: 0.000337.

Basque 48.18, Norwegian 26.14, Mixe 0.92, Finnish 10.45, English 9.73, Samaritan 1.48, Palestinian 0.78, Yemenite_Jew 1.32, Saxon 0.06, Norman 0.91, Piapoco 0.01. Misfit: 0.000342.

Basque 57.51, Norwegian 26.49, Mixe 0.92, Finnish 5.71, English 6.36, Samaritan 1.32, Palestinian 0.71, Saxon 0.84, Norman 0.84. Misfit: 0.000351.

Basque 57.6, Norwegian 27.66, Mixe 0.92, Finnish 4.76, English 6.99, Samaritan 1.35, Palestinian 0.72. Misfit: 0.000362.

Jordanian 0.29, Basque 57.32, Norwegian 26.81, Mixe 0.94, Finnish 4.86, English 8.50, Samaritan 1.28. Misfit: 0.000374.

Samaritan 1.28, Hungarian 2.71, Basque 59.44, Norwegian 35.72, Mixe 0.85. Misfit: 0.000391.

Jordanian 2.64, Basque 51.98, Norwegian 26.38, Mixe 0.94, Finnish 5.79, English 4.55, Polish 7.72. Misfit: 0.000401.

Jordanian 3.45, Basque 49.98, Norwegian 26.47, Mixe 0.93, Finnish 11.05, English 8.11. Misfit: 0.000414.

Jordanian 2.98, Hungarian 0.0, Basque 52.23, Norwegian 44.03, Mixe 0.76. Misfit: 0.000461.

Samaritan 1.32, Hungarian 1.46, Basque 60.06, Norwegian 37.16. Misfit: 0.000463.

Samaritan 2.03, Palestinian 0.87, Yemenite_Jew 6.92, Erfurt_ME 0.7, Tarrega 0.0, Norwegian 27.35, Finnish 20.66, Norman 1.1, Saxon 0.0, Polish 5.66, English 33.34, Mixe 0.87, Surui 0.17, Onge 0.32. Misfit: 0.000505.

English 16.95, Hungarian 4.24, Basque 53.67, Polish 23.94, Mixe 1.2. Misfit: 0.000609.

English 17.16, Hungarian 4.29, Basque 54.32, Polish 24.23, Tarrega 0.0. Misfit: 0.000760.

French 0.0, English 38.76, Hungarian 15.3, Basque 45.95, Bergamo 0.0. Misfit: 0.001025.

French 21.9, English 74.62, Yemenite_Jew 2.32, Sardinian 0.0, Mixe 1.16. Misfit: 0.001738.

 

./make_nnls_SGDP_only.R (NNLS for the SGDP-only set)

Norwegian 37.87, Basque 15.36, Sardinian 8.76, Jordanian 0.16, Tuscan 4.89, French 16.67, Estonian 16.28. Misfit: 0.000162.

 

(3) Rscript qpadm_ureurope.R

Set with medieval samples and selected control groups: 1272 variants and 65 people pass filters and QC. Total genotyping rate is 0.479519. 34556692 variants removed due to missing genotype data (--geno). qpAdm not possible.

Set with modern samples (SGDP only): qpadm_modern_power.bed has 346 samples and 138004 SNPs. Control groups: "Mbuti", "Papuan", "Australian", "Han", "Yoruba", "Yadava", "Bougainville", "Khomani_San", "Eskimo_Sireniki".
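Why so few variants survive can be illustrated with a small sketch of a per-variant missingness filter, which is what the --geno option applies: variants whose share of missing calls exceeds a threshold are removed. The threshold actually used in this run is not stated in the log, so the value below is an assumption, and the genotype table is a hypothetical toy example with None standing for a missing call.

```python
def filter_by_missingness(genotypes, max_missing=0.1):
    """Keep variants whose fraction of missing calls (None) is <= max_missing."""
    kept = {}
    for variant, calls in genotypes.items():
        missing_rate = sum(c is None for c in calls) / len(calls)
        if missing_rate <= max_missing:
            kept[variant] = calls
    return kept


def genotyping_rate(genotypes):
    """Share of all genotype calls that are not missing."""
    total = sum(len(calls) for calls in genotypes.values())
    called = sum(c is not None for calls in genotypes.values() for c in calls)
    return called / total


# Hypothetical toy data: three variants typed in four samples.
toy = {
    "rs1": [0, 1, 2, 0],
    "rs2": [None, None, 1, 0],
    "rs3": [None, 1, 1, 2],
}
print(genotyping_rate(toy))
print(sorted(filter_by_missingness(toy, max_missing=0.25)))
```

With a total genotyping rate below 0.5, as in the medieval set, most variants have many missing calls, so a strict threshold leaves only a small remainder.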

target        left       weight      se      z
[SampleName]  Sardinian  0.588       0.275   2.13
[SampleName]  Russian    0.412       0.325   1.27
[SampleName]  Karitiana  0.00000879  0.0521  0.000169

target        left       weight  se      z
[SampleName]  Sardinian  0.502   0.253   1.98
[SampleName]  Norwegian  0.478   0.278   1.72
[SampleName]  Karitiana  0.0202  0.0291  0.695

target        left       weight  se      z
[SampleName]  Sardinian  0.482   0.255   1.98
[SampleName]  Norwegian  0.490   0.279   1.72
[SampleName]  Karitiana  0.0166  0.0314  0.695
[SampleName]  Onge       0.0112  0.0345  0.324

target        left       weight  se      z
[SampleName]  Spanish    0.965   0.0160  60.4
[SampleName]  Karitiana  0.0349  0.0160  2.19

target        left       weight  se      z
[SampleName]  Basque     0.969   0.0160  60.7
[SampleName]  Karitiana  0.0313  0.0160  1.96

target        left      weight  se      z
[SampleName]  French    0.982   0.0249  39.5
[SampleName]  Mozabite  0.0177  0.0249  0.711

target        left      weight   se      z
[SampleName]  French    0.976    0.0402  24.3
[SampleName]  Mozabite  0.0205   0.0284  0.724
[SampleName]  Mixe      0.00316  0.0179  0.117

target        left       weight    se      z
[SampleName]  Hungarian  0.997     0.0446  22.4
[SampleName]  Mozabite   0.0278    0.0318  0.0874
[SampleName]  Mixe       0.000423  0.0193  0.0219

P-value: zero. No genetic linkage map found. Defining blocks by base pair distance of 2e+06.
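In the qpAdm tables above, the z column appears to be the weight divided by its standard error (for the first table, 0.588 / 0.275 is about 2.14). The short sketch below recomputes z from the weight and se columns and lists the sources whose weight is not clearly different from zero. The cut-off of 2 is an illustrative assumption, not a value taken from the analysis, and the function names are hypothetical.

```python
def z_scores(rows):
    """rows: (left population, weight, standard error) -> {population: z}."""
    return {left: weight / se for left, weight, se in rows}


def weakly_supported(rows, threshold=2.0):
    """Sources whose |z| is below the (assumed) threshold."""
    return [left for left, z in z_scores(rows).items() if abs(z) < threshold]


# Values copied from the first table (Sardinian / Russian / Karitiana).
first_table = [
    ("Sardinian", 0.588, 0.275),
    ("Russian", 0.412, 0.325),
    ("Karitiana", 0.00000879, 0.0521),
]
print(z_scores(first_table))
print(weakly_supported(first_table))
```

Large standard errors relative to the weights, as in the three-source models, mean that the split between the sources is poorly constrained even when the weights sum to one.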

 

(4) PCA 25 header tabs

(a) plink1.9 --bfile panel_plus_me --pca 25 header tabs --out my_real_pca_results --allow-no-sex (Set: Medieval Samples plus MySample)

Total genotyping rate is 0.322968. 1141987 variants and 34 people pass filters and QC. Excluding 26919 variants on non-autosomes from relationship matrix calc.

Rscript make_averages.R
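The script make_averages.R is not reproduced here. A plausible reading of its name is that it averages the per-sample PCA coordinates within each population, and the sketch below shows that step under two assumptions: the input is the tab-separated .eigenvec text written with the "header tabs" modifiers (columns FID, IID, PC1, PC2, ...), and the FID column carries the population label. Both the function name and the toy input are hypothetical.

```python
import csv
import io


def population_averages(eigenvec_text):
    """Average the PC columns per population (taken from the FID column)."""
    reader = csv.reader(io.StringIO(eigenvec_text), delimiter="\t")
    next(reader)  # skip the header line: FID, IID, PC1, PC2, ...
    sums, counts = {}, {}
    for row in reader:
        if not row:
            continue
        pop = row[0]
        coords = [float(x) for x in row[2:]]
        if pop not in sums:
            sums[pop] = [0.0] * len(coords)
            counts[pop] = 0
        sums[pop] = [s + c for s, c in zip(sums[pop], coords)]
        counts[pop] += 1
    return {pop: [s / counts[pop] for s in sums[pop]] for pop in sums}


# Hypothetical toy input with two principal components.
toy_eigenvec = (
    "FID\tIID\tPC1\tPC2\n"
    "French\tF1\t0.01\t0.02\n"
    "French\tF2\t0.03\t0.04\n"
    "Basque\tB1\t-0.01\t0.00\n"
)
print(population_averages(toy_eigenvec))
```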

(b) plink1.9 --bfile final_dataset --pca 25 header tabs --out my_real_pca_results2 --allow-no-sex (Set: Modern SGDP-Samples plus MySample)

Total genotyping rate is 0.998312. 139193 variants and 346 people pass filters and QC.

Complete file: my_g25_population_averages2.txt

plink1.9 --bfile final_dataset_geno --pca 25 header tabs --out my_real_pca_results4 --allow-no-sex (SGDP-Set with Maf and Geno Filters and LD Pruning)

Total genotyping rate is 0.999319. 137857 variants and 346 people pass filters and QC.

Complete file: my_g25_population_averages4.txt

(c) plink1.9 --bfile final_admixture_denser --pca 25 header tabs --out my_real_pca_results3 --allow-no-sex (Set: Modern SGDP-Samples plus medieval samples plus MySample)

Total genotyping rate is 0.958439. 79604 variants and 379 people pass filters and QC.

Complete file: my_g25_population_averages3.txt

Comparison with simulated G25 coordinates, scaled to the Eurogenes standard (no basis for comparison):

(d) vahaduo.github.io/vahaduo (Distances Selected Samples SGDP + MA)

Distance to:  SampleName
0.07401210  Russian
0.08050403  Hungarian
0.08381535  Estonian
0.08612183  Basque
0.09404699  Sardinian
0.09547174  Spanish
0.10023862  Norwegian
0.10322525  Tuscan
0.10355739  Finnish
0.10370240  Bulgarian
0.10439516  Orcadian
0.10681300  Bergamo
0.10781833  Greek
0.10789566  Crete
0.11127521  Czech
0.11196990  Icelandic
0.11356206  English
0.11424907  Albanian
0.11432787  Polish
0.11830513  French
0.12274385  North_Ossetian
0.12553693  Georgian
0.12671697  Turkish
0.13056338  Saami
0.13344145  Iranian
0.14043744  Druze
0.14463085  Jordanian
0.14525401  Palestinian
0.15256101  Iraqi_Jew
0.15615037  Brahui
0.15637419  Mozabite
0.16496939  Samaritan
0.16505226  Saharawi
0.16929639  Kashmiri_Pandit
0.17458328  Pathan
0.17827562  Yemenite_Jew
0.18046981  BedouinB
0.19854582  sample_simulated_g25_scaled
0.20493563  Brahmin
0.20779100  Iraqw
0.23880109  Onge
0.24031188  Mixe
0.25894328  Surui
0.26033090  Tarrega
0.26354348  Ireland_Kilteasheen_AngloSaxon_EMedieval_Norman
0.28297988  Germany_Anderten_Saxon_Medieval
0.35505566  Germany_Erfurt_ME
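A distance list of this kind can be reproduced in principle by computing the Euclidean distance between the target's coordinates and each population average and sorting the result. The sketch below assumes that this is what the Distances view reports. The coordinates are hypothetical two-dimensional toy values, not the 25-dimensional averages used above.

```python
import math


def rank_by_distance(target, references):
    """Return (distance, name) pairs sorted from nearest to farthest."""
    return sorted((math.dist(target, coords), name) for name, coords in references.items())


# Hypothetical toy coordinates.
toy_target = [0.0, 0.0]
toy_refs = {"PopA": [3.0, 4.0], "PopB": [1.0, 0.0]}
for distance, name in rank_by_distance(toy_target, toy_refs):
    print(f"{distance:.8f}  {name}")
```

Because the ranking depends entirely on the coordinate space, lists computed from different PCA runs (different sample sets or filters) are not directly comparable with one another.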

(e) vahaduo.github.io/vahaduo (Distances Complete SGDP+MA Samples)

Distance to:  SampleName
0.07401210  Russian
0.08050403  Hungarian
0.08381535  Estonian
0.08612183  Basque
0.09404699  Sardinian
0.09547174  Spanish
0.09931708  Maori
0.10023862  Norwegian
0.10322525  Tuscan
0.10355739  Finnish
0.10370240  Bulgarian
0.10439516  Orcadian
0.10681300  Bergamo
0.10731372  Lezgin
0.10781833  Greek
0.10789566  Crete
0.11127521  Czech
0.11196990  Icelandic
0.11356206  English
0.11397229  Adygei
0.11424907  Albanian
0.11432787  Polish
0.11830513  French
0.12274385  North_Ossetian
0.12553693  Georgian
0.12671697  Turkish
0.12835549  Tajik
0.13056338  Saami
0.13131029  Abkhasian
0.13344145  Iranian
0.13701146  Chechen
0.13848339  Uygur
0.14043744  Druze
0.14045034  Burusho
0.14308799  Armenian
0.14463085  Jordanian
0.14525401  Palestinian
0.14597358  Hazara
0.15256101  Iraqi_Jew
0.15259802  Tubalar
0.15520251  Tlingit
0.15615037  Brahui
0.15637419  Mozabite
0.16387500  Mansi
0.16496939  Samaritan
0.16505226  Saharawi
0.16566232  Kyrgyz
0.16869217  Cree
0.16905383  Chukchi
0.16929639  Kashmiri_Pandit

(f) vahaduo.github.io/vahaduo (Single Mode Selected SGDP+MA Samples)

Target: SampleName
Distance: 5.5720% / 0.05572012
40.8  Russian
18.4  Crete
13.6  Estonian
9.8  Tarrega
8.4  Ireland_Kilteasheen_AngloSaxon_EMedieval_Norman
5.0  Germany_Anderten_Saxon_Medieval
4.0  Germany_Erfurt_ME
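All percentages in this output are multiples of 0.2, which would be consistent with a fit that distributes a fixed number of equal "slots" among the sources. The sketch below illustrates that idea with a simple stochastic search: move one slot from a random source to another and keep the move only if the distance between the mixed coordinates and the target shrinks. This is an assumption about the general approach, not Vahaduo's actual code. The slot count of 500, the iteration count, the seed and the toy coordinates are all hypothetical.

```python
import math
import random


def monte_carlo_mix(target, sources, slots=500, iters=20000, seed=0):
    """Stochastic slot-moving search for mixture proportions.

    Returns (percentages per source, final distance to the target).
    """
    rng = random.Random(seed)
    names = list(sources)
    counts = [0] * len(names)
    counts[0] = slots  # start with every slot on the first source

    def distance(cnts):
        mix = [
            sum(c * sources[n][k] for c, n in zip(cnts, names)) / slots
            for k in range(len(target))
        ]
        return math.dist(mix, target)

    best = distance(counts)
    for _ in range(iters):
        donor = rng.choice([i for i, c in enumerate(counts) if c > 0])
        receiver = rng.choice([k for k in range(len(names)) if k != donor])
        counts[donor] -= 1
        counts[receiver] += 1
        trial = distance(counts)
        if trial < best:
            best = trial  # keep the improving move
        else:
            counts[donor] += 1  # undo the move
            counts[receiver] -= 1
    return {n: 100.0 * c / slots for n, c in zip(names, counts)}, best


# Hypothetical toy coordinates: the target is an even mix of PopA and PopB.
toy_sources = {"PopA": [0.1, 0.0], "PopB": [0.0, 0.1], "PopC": [-0.1, -0.1]}
percentages, dist = monte_carlo_mix([0.05, 0.05], toy_sources)
print(percentages, dist)
```

Because the search is random, repeated runs on real data can end at slightly different proportions with nearly identical distances, especially when several sources lie close together in the coordinate space.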

(g) vahaduo.github.io/vahaduo (Multi Mode Complete SGDP+MA Samples, without Maori/Miao)

Target      Distance    Crete  Estonian  Saxon_Medieval  Erfurt_ME  Ireland_Kilteasheen_Norman  Lezgin  Mixtec  Russian  Tarrega
SampleName  0.05571388  18.0   13.4      5.0             4.0        8.6                         1.0     0.2     40.2     9.6

(h) vahaduo.github.io/vahaduo (Distances Set of Complete SGDP Samples without MA Samples, final_dataset)

Distance to:  SampleName
0.05861064  Icelandic
0.06179774  English
0.06321238  Hungarian
0.07021550  Norwegian
0.07043423  Orcadian
0.07621494  French
0.08323196  Polish
0.08602913  Czech
0.09331285  Finnish
0.09804208  Estonian
0.10152774  Russian
0.12973190  Bulgarian
0.13274871  Bergamo
0.13865895  Tuscan
0.13951420  Spanish
0.14088675  Basque
0.15796106  Albanian
0.17836277  Maori
0.19260491  Crete
0.19528783  Uygur
0.19996488  Greek
0.20520075  Hazara
0.20981267  Kyrgyz
0.21668757  Chukchi
0.21815570  Tubalar
[...]
0.24338697  Turkish
0.24381916  Sindhi
0.24427359  Saami
0.24491521  Palestinian
0.24858766  Lezgin
[...]
0.25732897  Iranian
0.25746132  Armenian
0.26026554  North_Ossetian
0.26052663  Mixtec
0.26086147  Iraqi_Jew
[...]
0.26600958  Sardinian
0.29903512  Yemenite_Jew
0.31932877  Jordanian
0.37179271  Samaritan
0.43864190  Saharawi

Set SGDP without MA Samples (final_dataset_geno with Maf and Geno Filters and LD Pruning)

Distance to:  SampleName
0.05634010  Icelandic
0.06060490  English
0.06303751  Hungarian
0.06800178  Orcadian
0.07342989  Norwegian
0.07497899  Polish
0.07603280  French
0.08548837  Czech
0.09662956  Finnish
0.09993134  Estonian
0.10287890  Russian
0.12916700  Bulgarian
0.12990495  Bergamo
0.13653458  Tuscan
0.13819052  Spanish

(i) vahaduo.github.io/vahaduo (Single/Multi Mode Runs Set of Complete SGDP Samples without Australian and without MA Samples, final_dataset)

Run #1 (Single Mode)

Target: SampleName
Distance: 2.2918% / 0.02291798
22.0  Orcadian
19.8  Finnish
19.4  Estonian
14.8  Norwegian
11.2  Sardinian
6.0  Basque
4.4  BedouinB
2.2  Surui
0.2  French

Run #2 (Multi Mode, Add Dist. Col. = No)

Target      Distance    Basque  BedouinB  Estonian  Finnish  French  Norwegian  Orcadian  Sardinian  Surui
SampleName  0.02291345  4.8     4.4       19.4      19.0     0.8     15.4       22.6      11.6       2.0

Run #3 (Multi Mode, Add Dist. Col. = 0.25x, Recalculate = No)

Target      Distance    BedouinB  Estonian  Finnish  French  Hungarian  Norwegian  Orcadian  Sardinian
SampleName  0.02779523  3.4       12.2      0.8      24.0    12.2       23.8       22.4      1.2

The individual Monte Carlo simulations vary slightly. Run #4 uses final_dataset_geno without Australian / Bantu, Multi Mode, Add Dist. Col. = 0.25x, Recalculate = No; Run #5 uses Add Dist. Col. = No.

(5) Comparison NNLS - PCA (great-grandparents)

Method    4x "German"    1x "English-American"    1x "Acadian-American"    2x "Ashkenazic-Sephardic"    Quality

NNLS SGDP+MA K 64
Norwegian  Finnish  Saxon  English  Norman  Mixe  Surui  Basque  Samaritan  Palestinian  Yemenite_Jew  Polish  Onge  Misfit
26.39      9.39     0.08   8.47     0.89    0.91  0.22   49.7    1.43       0.76         1.08          0.22    0.44  0.000316

NNLS SGDP K 64 (final_dataset_geno)
Norwegian  French  Basque  Sardinian  Tuscan  Jordanian  Estonian  Misfit
37.87      16.67   15.36   8.76       4.89    0.16       16.28     0.000162

PCA 25 SGDP+MA
Russian  Saxon  Norman  Mixtec  Crete  Tarrega  Erfurt_ME  Estonian  Lezgin  Distance
40.2     5.0    8.6     0.2     18.0   9.6      4.0        13.4      1.0     0.05571388

PCA 25 SGDP, Run #2 (final_dataset)
Norwegian  Finnish  Orcadian  French  Surui  Sardinian  Basque  BedouinB  Estonian  Distance
15.4       19.0     22.6      0.8     2.0    11.6       4.8     4.4       19.4      0.02291345

PCA 25 SGDP, Run #3 (ADR 0.25, final_dataset)
Norwegian  Finnish  Orcadian  French  Sardinian  Hungarian  BedouinB  Estonian  Distance
23.8       0.8      22.4      24.0    1.2        12.2       3.4       12.2      0.02779523

PCA 25 SGDP, Run #4 (ADR 0.25, final_dataset_geno)
Norwegian  Icelandic  Orcadian  French  Surui  Sardinian  Hungarian  BedouinB  Czech  Estonian  Distance
23.4       17.4       17.4      14.0    0.2    3.6        16.6       2.4       0.4    4.6       0.02865856

PCA 25 SGDP, Run #5 (final_dataset_geno)
Norwegian  Finnish  Icelandic  Orcadian  Surui  Sardinian  Hungarian  BedouinB  Estonian  Distance
20.4       12.0     5.2        20.4      2.4    13.6       4.6        3.2       18.2      0.02431329

 

E-Mail kriswagenseil [at] gmx [point] de