III. Evaluation experiments
(1) ./make_nnls.R Tarrega Germany_Erfurt_ME Ireland_Kilteasheen_AngloSaxon_EMedieval_Norman Germany_Anderten_Saxon_Medieval (medieval samples as anchors)
2 populations: Tarrega 44.72, Norman 55.28. Misfit: 0.568461.
2 populations: Tarrega 36.94, Erfurt_ME 63.06. Misfit: 0.568468.
2 populations: Erfurt_ME 79.3, Saxon 20.7. Misfit: 0.568498.
2 populations: Tarrega 75.98, Saxon 24.02. Misfit: 0.568629.
4 populations: Saxon 6.18, Norman 39.53, Erfurt_ME 35.12, Tarrega 19.18. Misfit: 0.568284.
5 populations (+ French): Tarrega 0.51, Erfurt_ME 0.41, Norman 1.11, Saxon 0.0, French 97.97. Misfit: 0.003069.
5 populations (+ English): Tarrega 0.0, Erfurt_ME 0.43, Norman 0.81, Saxon 0.0, English 98.76. Misfit: 0.003305.
5 populations (+ Basque): Tarrega 0.0, Erfurt_ME 0.0, Norman 0.69, Saxon 0.83, Basque 98.49. Misfit: 0.004129.
5 populations (+ Hungarian): Tarrega 0.0, Erfurt_ME 0.0, Norman 0.0, Saxon 0.0, Hungarian 100.0. Misfit: 0.006451.
5 populations (+ Bergamo): Tarrega 0.14, Erfurt_ME 0.0, Norman 0.48, Saxon 0.0, Bergamo 99.38. Misfit: 0.062901.
5 populations (+ Sardinian): Tarrega 0.0, Erfurt_ME 0.0, Norman 2.4, Saxon 0.63, Sardinian 96.96. Misfit: 0.296521.
5 populations (+ Turkish): Tarrega 4.51, Erfurt_ME 8.41, Norman 10.06, Saxon 0.84, Turkish 76.18. Misfit: 0.549354.
5 populations (+ Yemenite_Jew): Tarrega 8.0, Erfurt_ME 14.66, Norman 16.5, Saxon 2.58, Yemenite_Jew 58.26. Misfit: 0.554035.
5 populations (+ Mixe): Tarrega 16.77, Erfurt_ME 30.7, Norman 34.56, Saxon 5.4, Mixe 12.57. Misfit: 0.568133.
5 populations (+ Han): Tarrega 19.17, Erfurt_ME 35.1, Norman 39.51, Saxon 6.18, Han 0.04. Misfit: 0.568284.
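The misfit stays near 0.568 as long as only the medieval anchors are offered and drops to about 0.003 once a close modern population is added. A non-negative least squares fit would be expected to behave this way when the anchors cannot reproduce the target. The source of make_nnls.R is not shown here, so the following is only a minimal Python sketch under assumptions: the inputs are per-population component vectors, the misfit is taken to be the Euclidean norm of the residual, weights are rescaled to percentages, and all numbers are invented toy values.

```python
from math import sqrt

def nnls(sources, target, sweeps=2000):
    """Coordinate-descent NNLS: minimise ||sum_j w[j]*sources[j] - target|| with w[j] >= 0."""
    weights = [0.0] * len(sources)
    residual = [-t for t in target]  # residual = mix - target; the mix starts at zero
    norms = [sum(x * x for x in s) for s in sources]
    for _ in range(sweeps):
        for j, s in enumerate(sources):
            if norms[j] == 0.0:
                continue
            gradient = sum(x * r for x, r in zip(s, residual))
            new_w = max(0.0, weights[j] - gradient / norms[j])  # clip at zero
            delta = new_w - weights[j]
            if delta != 0.0:
                residual = [r + delta * x for r, x in zip(residual, s)]
                weights[j] = new_w
    misfit = sqrt(sum(r * r for r in residual))  # assumed definition of "misfit"
    return weights, misfit

def as_percent(weights):
    """Rescale weights so that they sum to 100 (assumed reporting convention)."""
    total = sum(weights)
    return [round(100.0 * w / total, 2) for w in weights] if total > 0 else list(weights)

# Toy component vectors (hypothetical, K = 3).
anchor_a = [0.9, 0.1, 0.0]
anchor_b = [0.1, 0.9, 0.0]
close_pop = [0.25, 0.15, 0.6]
target = [0.2, 0.2, 0.6]

w_anchors, misfit_anchors = nnls([anchor_a, anchor_b], target)
w_full, misfit_full = nnls([anchor_a, anchor_b, close_pop], target)
print(as_percent(w_anchors), round(misfit_anchors, 6))
print(as_percent(w_full), round(misfit_full, 6))
```

In the toy data the two anchors lack the third component entirely, so their misfit cannot fall below the size of that component, no matter how the weights are split between them.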
(2) ./make_nnls.R French English Yemenite_Jew X Y (modern populations as anchors)
Basque 49.7 ( ~ "Ibero-Phoenician"), Norwegian 26.39, Finnish 9.39, Saxon 0.08 ( = 35.86 "German"), English 8.47, Norman 0.89 ( = 9.36 "White North American"), Samaritan 1.43, Palestinian 0.76, Yemenite_Jew 1.08 ( = 3.27 "Levantine"), Polish 0.22 ( ~ "East European"), Mixe 0.91, Surui 0.22 ( = 1.13 "Mi'kmaq"), Onge 0.44. Misfit: 0.000316.
Basque 48.18, Norwegian 26.54, Mixe 0.92, Finnish 9.74, English 9.74, Samaritan 1.47, Palestinian 0.77, Yemenite_Jew 1.25, Saxon 0.06, Norman 0.9, Surui 0.22. Misfit: 0.000337.
Basque 48.18, Norwegian 26.14, Mixe 0.92, Finnish 10.45, English 9.73, Samaritan 1.48, Palestinian 0.78, Yemenite_Jew 1.32, Saxon 0.06, Norman 0.91, Piapoco 0.01. Misfit: 0.000342.
Basque 57.51, Norwegian 26.49, Mixe 0.92, Finnish 5.71, English 6.36, Samaritan 1.32, Palestinian 0.71, Saxon 0.84, Norman 0.84. Misfit: 0.000351.
Basque 57.6, Norwegian 27.66, Mixe 0.92, Finnish 4.76, English 6.99, Samaritan 1.35, Palestinian 0.72. Misfit: 0.000362.
Jordanian 0.29, Basque 57.32, Norwegian 26.81, Mixe 0.94, Finnish 4.86, English 8.50, Samaritan 1.28. Misfit: 0.000374.
Samaritan 1.28, Hungarian 2.71, Basque 59.44, Norwegian 35.72, Mixe 0.85. Misfit: 0.000391.
Jordanian 2.64, Basque 51.98, Norwegian 26.38, Mixe 0.94, Finnish 5.79, English 4.55, Polish 7.72. Misfit: 0.000401.
Jordanian 3.45, Basque 49.98, Norwegian 26.47, Mixe 0.93, Finnish 11.05, English 8.11. Misfit: 0.000414.
Jordanian 2.98, Hungarian 0.0, Basque 52.23, Norwegian 44.03, Mixe 0.76. Misfit: 0.000461.
Samaritan 1.32, Hungarian 1.46, Basque 60.06, Norwegian 37.16. Misfit: 0.000463.
Samaritan 2.03, Palestinian 0.87, Yemenite_Jew 6.92, Erfurt_ME 0.7, Tarrega 0.0, Norwegian 27.35, Finnish 20.66, Norman 1.1, Saxon 0.0, Polish 5.66, English 33.34, Mixe 0.87, Surui 0.17, Onge 0.32. Misfit: 0.000505.
English 16.95, Hungarian 4.24, Basque 53.67, Polish 23.94, Mixe 1.2. Misfit: 0.000609.
English 17.16, Hungarian 4.29, Basque 54.32, Polish 24.23, Tarrega 0.0. Misfit: 0.000760.
French 0.0, English 38.76, Hungarian 15.3, Basque 45.95, Bergamo 0.0. Misfit: 0.001025.
French 21.9, English 74.62, Yemenite_Jew 2.32, Sardinian 0.0, Mixe 1.16. Misfit: 0.001738.
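The first result line of experiment (2) groups individual reference populations under genealogical labels by adding their weights. A short Python sketch of that bookkeeping follows. The weights and the group memberships are copied from that line; the dictionary layout and the function name are only illustrative.

```python
# Weights (percent) from the first result line of experiment (2).
weights = {
    "Basque": 49.7, "Norwegian": 26.39, "Finnish": 9.39, "Saxon": 0.08,
    "English": 8.47, "Norman": 0.89, "Samaritan": 1.43, "Palestinian": 0.76,
    "Yemenite_Jew": 1.08, "Polish": 0.22, "Mixe": 0.91, "Surui": 0.22, "Onge": 0.44,
}

# Group memberships as annotated in that line (labels are the author's).
groups = {
    "German": ["Norwegian", "Finnish", "Saxon"],
    "White North American": ["English", "Norman"],
    "Levantine": ["Samaritan", "Palestinian", "Yemenite_Jew"],
    "Mi'kmaq": ["Mixe", "Surui"],
}

def group_totals(weights, groups):
    """Sum the member weights of each group, rounded to two decimals."""
    return {label: round(sum(weights[p] for p in members), 2)
            for label, members in groups.items()}

totals = group_totals(weights, groups)
for label, value in totals.items():
    print(f"{label}: {value}")
print("all components:", round(sum(weights.values()), 2))
```

Printing the overall sum is a cheap check that no component was dropped or mistyped when copying a result line.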
./make_nnls_SGDP_only.R (NNLS for the SGDP-only set)
Norwegian 37.87, Basque 15.36, Sardinian 8.76, Jordanian 0.16, Tuscan 4.89, French 16.67, Estonian 16.28. Misfit: 0.000162.
(3) Rscript qpadm_ureurope.R
Set with medieval samples and selected control groups: 1272 variants and 65 people pass filters and QC. Total genotyping rate is 0.479519. 34556692 variants removed due to missing genotype data (--geno). qpAdm not possible.
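PLINK's --geno filter removes every variant whose missing-call rate exceeds a threshold (0.1 by default). With a total genotyping rate below 0.5, almost no variant survives it, which is why only 1272 variants remain. The Python sketch below illustrates the rule on toy data. The threshold actually used in this run is not stated, so 0.1 is an assumption, and the variant IDs and genotypes are invented.

```python
def missing_rate(calls):
    """Fraction of missing calls (None) for one variant across all samples."""
    return sum(c is None for c in calls) / len(calls)

def geno_filter(variants, max_missing=0.1):
    """Keep variants whose missing-call rate does not exceed max_missing."""
    return {vid: calls for vid, calls in variants.items()
            if missing_rate(calls) <= max_missing}

def genotyping_rate(variants):
    """Share of non-missing calls over all variants and samples."""
    total = sum(len(calls) for calls in variants.values())
    called = sum(c is not None for calls in variants.values() for c in calls)
    return called / total

# Toy genotypes (hypothetical): allele counts 0/1/2, None = missing call.
toy = {
    "rs_a": [0, 1, 2, 1],
    "rs_b": [None, None, 1, 0],
    "rs_c": [2, None, None, None],
}
kept = geno_filter(toy, max_missing=0.1)
print(sorted(kept), round(genotyping_rate(toy), 3))
```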
Set with modern samples (SGDP only): qpadm_modern_power.bed has 346 samples and 138004 SNPs. Control groups: "Mbuti", "Papuan", "Australian", "Han", "Yoruba", "Yadava", "Bougainville", "Khomani_San", "Eskimo_Sireniki".
| target | left | weight | se | z |
|---|---|---|---|---|
| [SampleName] | Sardinian | 0.588 | 0.275 | 2.13 |
| [SampleName] | Russian | 0.412 | 0.325 | 1.27 |
| [SampleName] | Karitiana | 0.00000879 | 0.0521 | 0.000169 |
| target | left | weight | se | z |
|---|---|---|---|---|
| [SampleName] | Sardinian | 0.502 | 0.253 | 1.98 |
| [SampleName] | Norwegian | 0.478 | 0.278 | 1.72 |
| [SampleName] | Karitiana | 0.0202 | 0.0291 | 0.695 |
| target | left | weight | se | z |
|---|---|---|---|---|
| [SampleName] | Sardinian | 0.482 | 0.255 | 1.98 |
| [SampleName] | Norwegian | 0.490 | 0.279 | 1.72 |
| [SampleName] | Karitiana | 0.0166 | 0.0314 | 0.695 |
| [SampleName] | Onge | 0.0112 | 0.0345 | 0.324 |
| target | left | weight | se | z |
|---|---|---|---|---|
| [SampleName] | Spanish | 0.965 | 0.0160 | 60.4 |
| [SampleName] | Karitiana | 0.0349 | 0.0160 | 2.19 |
| target | left | weight | se | z |
|---|---|---|---|---|
| [SampleName] | Basque | 0.969 | 0.0160 | 60.7 |
| [SampleName] | Karitiana | 0.0313 | 0.0160 | 1.96 |
| target | left | weight | se | z |
|---|---|---|---|---|
| [SampleName] | French | 0.982 | 0.0249 | 39.5 |
| [SampleName] | Mozabite | 0.0177 | 0.0249 | 0.711 |
| target | left | weight | se | z |
|---|---|---|---|---|
| [SampleName] | French | 0.976 | 0.0402 | 24.3 |
| [SampleName] | Mozabite | 0.0205 | 0.0284 | 0.724 |
| [SampleName] | Mixe | 0.00316 | 0.0179 | 0.117 |
| target | left | weight | se | z |
|---|---|---|---|---|
| [SampleName] | Hungarian | 0.997 | 0.0446 | 22.4 |
| [SampleName] | Mozabite | 0.00278 | 0.0318 | 0.0874 |
| [SampleName] | Mixe | 0.000423 | 0.0193 | 0.0219 |
P-value: zero. No genetic linkage map found. Defining blocks by base pair distance of 2e+06.
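In the qpAdm tables above, the z column is the weight divided by its standard error, and the weights of one model sum to 1. The Python sketch below recomputes both for the first table (Sardinian / Russian / Karitiana). Small differences from the printed z values are expected because the table shows rounded weights and errors. The cutoff of 2 is a common rule of thumb, not something the source specifies.

```python
# (left population, weight, standard error) from the first qpAdm table.
rows = [
    ("Sardinian", 0.588, 0.275),
    ("Russian", 0.412, 0.325),
    ("Karitiana", 0.00000879, 0.0521),
]

def z_score(weight, se):
    """z = weight / standard error."""
    return weight / se

for left, weight, se in rows:
    z = z_score(weight, se)
    # Assumed rule of thumb: |z| < 2 means the weight is within about 2 SE of zero.
    note = "" if abs(z) >= 2 else "  (within about 2 SE of zero)"
    print(f"{left:10s} weight={weight:.6f} z={z:.3g}{note}")

weight_sum = sum(weight for _, weight, _ in rows)
print("sum of weights:", round(weight_sum, 4))
```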
(4) PCA 25 header tabs
(a) plink1.9 --bfile panel_plus_me --pca 25 header tabs --out my_real_pca_results --allow-no-sex (Set: Medieval Samples plus MySample)
Total genotyping rate is 0.322968. 1141987 variants and 34 people pass filters and QC. Excluding 26919 variants on non-autosomes from relationship matrix calc.
Rscript make_averages.R
(b) plink1.9 --bfile final_dataset --pca 25 header tabs --out my_real_pca_results2 --allow-no-sex (Set: Modern SGDP-Samples plus MySample)
Total genotyping rate is 0.998312. 139193 variants and 346 people pass filters and QC.
Complete file: my_g25_population_averages2.txt
plink1.9 --bfile final_dataset_geno --pca 25 header tabs --out my_real_pca_results4 --allow-no-sex (SGDP-Set with Maf and Geno Filters and LD Pruning)
Total genotyping rate is 0.999319. 137857 variants and 346 people pass filters and QC.
Complete file: my_g25_population_averages4.txt
(c) plink1.9 --bfile final_admixture_denser --pca 25 header tabs --out my_real_pca_results3 --allow-no-sex (Set: Modern SGDP-Samples plus medieval samples plus MySample)
Total genotyping rate is 0.958439. 79604 variants and 379 people pass filters and QC.
Complete file: my_g25_population_averages3.txt
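With `--pca 25 header tabs`, plink1.9 writes a tab-separated .eigenvec file whose header line is FID, IID, PC1 ... PC25. The population-average files listed above can be derived from it by averaging each PC per population. make_averages.R itself is not shown, so this Python sketch is only an assumption about what such a step might look like. In particular it assumes that the FID column carries the population label, and the toy rows are invented.

```python
import csv
import io
from collections import defaultdict

def population_averages(lines):
    """Average every PC column per population from tab-separated eigenvec lines."""
    reader = csv.DictReader(lines, delimiter="\t")
    pc_columns = [name for name in reader.fieldnames if name.startswith("PC")]
    sums = {}
    counts = defaultdict(int)
    for row in reader:
        pop = row["FID"]  # assumption: FID holds the population label
        values = [float(row[name]) for name in pc_columns]
        if pop in sums:
            sums[pop] = [a + b for a, b in zip(sums[pop], values)]
        else:
            sums[pop] = values
        counts[pop] += 1
    return {pop: [s / counts[pop] for s in total] for pop, total in sums.items()}

# Toy eigenvec content with two PCs instead of 25 (hypothetical values).
toy_eigenvec = io.StringIO(
    "FID\tIID\tPC1\tPC2\n"
    "Basque\tB1\t0.10\t-0.20\n"
    "Basque\tB2\t0.30\t0.00\n"
    "MySample\tme\t0.05\t0.01\n"
)
averages = population_averages(toy_eigenvec)
for pop, coords in averages.items():
    print(pop, [round(c, 4) for c in coords])
```

For a real run, an open file handle on the .eigenvec file can be passed in place of the `io.StringIO` object.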
Comparison with simulated G25 coordinates, scaled according to the Eurogenes standard (no basis for comparison):
(d) vahaduo.github.io/vahaduo (Distances Selected Samples SGDP + MA)
| Distance to: | SampleName |
|---|---|
| 0.07401210 | Russian |
| 0.08050403 | Hungarian |
| 0.08381535 | Estonian |
| 0.08612183 | Basque |
| 0.09404699 | Sardinian |
| 0.09547174 | Spanish |
| 0.10023862 | Norwegian |
| 0.10322525 | Tuscan |
| 0.10355739 | Finnish |
| 0.10370240 | Bulgarian |
| 0.10439516 | Orcadian |
| 0.10681300 | Bergamo |
| 0.10781833 | Greek |
| 0.10789566 | Crete |
| 0.11127521 | Czech |
| 0.11196990 | Icelandic |
| 0.11356206 | English |
| 0.11424907 | Albanian |
| 0.11432787 | Polish |
| 0.11830513 | French |
| 0.12274385 | North_Ossetian |
| 0.12553693 | Georgian |
| 0.12671697 | Turkish |
| 0.13056338 | Saami |
| 0.13344145 | Iranian |
| 0.14043744 | Druze |
| 0.14463085 | Jordanian |
| 0.14525401 | Palestinian |
| 0.15256101 | Iraqi_Jew |
| 0.15615037 | Brahui |
| 0.15637419 | Mozabite |
| 0.16496939 | Samaritan |
| 0.16505226 | Saharawi |
| 0.16929639 | Kashmiri_Pandit |
| 0.17458328 | Pathan |
| 0.17827562 | Yemenite_Jew |
| 0.18046981 | BedouinB |
| 0.19854582 | sample_simulated_g25_scaled |
| 0.20493563 | Brahmin |
| 0.20779100 | Iraqw |
| 0.23880109 | Onge |
| 0.24031188 | Mixe |
| 0.25894328 | Surui |
| 0.26033090 | Tarrega |
| 0.26354348 | Ireland_Kilteasheen_AngloSaxon_EMedieval_Norman |
| 0.28297988 | Germany_Anderten_Saxon_Medieval |
| 0.35505566 | Germany_Erfurt_ME |
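A distance table of this kind can be reproduced offline by ranking the reference populations by their distance to the target in PCA space. The Python sketch below assumes plain Euclidean distance over the coordinate vectors, which is how such tables are usually understood. The coordinates and population names are invented, and three dimensions stand in for the 25 used here.

```python
from math import dist

def rank_by_distance(target, references):
    """Return (distance, name) pairs sorted from nearest to farthest."""
    return sorted((dist(target, coords), name) for name, coords in references.items())

# Hypothetical 3-dimensional coordinates; the real vectors have 25 dimensions.
target = [0.12, 0.03, -0.01]
references = {
    "PopA": [0.10, 0.05, 0.00],
    "PopB": [0.02, -0.04, 0.03],
    "PopC": [0.13, 0.03, -0.02],
}
ranking = rank_by_distance(target, references)
for distance, name in ranking:
    print(f"{distance:.8f}  {name}")
```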
(e) vahaduo.github.io/vahaduo (Distances Complete SGDP+MA Samples)
| Distance to: | SampleName |
|---|---|
| 0.07401210 | Russian |
| 0.08050403 | Hungarian |
| 0.08381535 | Estonian |
| 0.08612183 | Basque |
| 0.09404699 | Sardinian |
| 0.09547174 | Spanish |
| 0.09931708 | Maori |
| 0.10023862 | Norwegian |
| 0.10322525 | Tuscan |
| 0.10355739 | Finnish |
| 0.10370240 | Bulgarian |
| 0.10439516 | Orcadian |
| 0.10681300 | Bergamo |
| 0.10731372 | Lezgin |
| 0.10781833 | Greek |
| 0.10789566 | Crete |
| 0.11127521 | Czech |
| 0.11196990 | Icelandic |
| 0.11356206 | English |
| 0.11397229 | Adygei |
| 0.11424907 | Albanian |
| 0.11432787 | Polish |
| 0.11830513 | French |
| 0.12274385 | North_Ossetian |
| 0.12553693 | Georgian |
| 0.12671697 | Turkish |
| 0.12835549 | Tajik |
| 0.13056338 | Saami |
| 0.13131029 | Abkhasian |
| 0.13344145 | Iranian |
| 0.13701146 | Chechen |
| 0.13848339 | Uygur |
| 0.14043744 | Druze |
| 0.14045034 | Burusho |
| 0.14308799 | Armenian |
| 0.14463085 | Jordanian |
| 0.14525401 | Palestinian |
| 0.14597358 | Hazara |
| 0.15256101 | Iraqi_Jew |
| 0.15259802 | Tubalar |
| 0.15520251 | Tlingit |
| 0.15615037 | Brahui |
| 0.15637419 | Mozabite |
| 0.16387500 | Mansi |
| 0.16496939 | Samaritan |
| 0.16505226 | Saharawi |
| 0.16566232 | Kyrgyz |
| 0.16869217 | Cree |
| 0.16905383 | Chukchi |
| 0.16929639 | Kashmiri_Pandit |
(f) vahaduo.github.io/vahaduo (Single Mode Selected SGDP+MA Samples)
| Target: SampleName Distance: 5.5720% / 0.05572012 | |
|---|---|
| 40.8 | Russian |
| 18.4 | Crete |
| 13.6 | Estonian |
| 9.8 | Tarrega |
| 8.4 | Ireland_Kilteasheen_AngloSaxon_EMedieval_Norman |
| 5.0 | Germany_Anderten_Saxon_Medieval |
| 4.0 | Germany_Erfurt_ME |
(g) vahaduo.github.io/vahaduo (Multi Mode Complete SGDP+MA Samples, without Maori/Miao)
| Target | Distance | Crete | Estonian | Saxon_Medieval | Erfurt_ME | Ireland_Kilteasheen_Norman | Lezgin | Mixtec | Russian | Tarrega |
| SampleName | 0.05571388 | 18.0 | 13.4 | 5.0 | 4.0 | 8.6 | 1.0 | 0.2 | 40.2 | 9.6 |
(h) vahaduo.github.io/vahaduo (Distances Set of Complete SGDP Samples without MA Samples, final_dataset)
| Distance to: | SampleName |
|---|---|
| 0.05861064 | Icelandic |
| 0.06179774 | English |
| 0.06321238 | Hungarian |
| 0.07021550 | Norwegian |
| 0.07043423 | Orcadian |
| 0.07621494 | French |
| 0.08323196 | Polish |
| 0.08602913 | Czech |
| 0.09331285 | Finnish |
| 0.09804208 | Estonian |
| 0.10152774 | Russian |
| 0.12973190 | Bulgarian |
| 0.13274871 | Bergamo |
| 0.13865895 | Tuscan |
| 0.13951420 | Spanish |
| 0.14088675 | Basque |
| 0.15796106 | Albanian |
| 0.17836277 | Maori |
| 0.19260491 | Crete |
| 0.19528783 | Uygur |
| 0.19996488 | Greek |
| 0.20520075 | Hazara |
| 0.20981267 | Kyrgyz |
| 0.21668757 | Chukchi |
| 0.21815570 | Tubalar |
| [...] | |
| 0.24338697 | Turkish |
| 0.24381916 | Sindhi |
| 0.24427359 | Saami |
| 0.24491521 | Palestinian |
| 0.24858766 | Lezgin |
| [...] | |
| 0.25732897 | Iranian |
| 0.25746132 | Armenian |
| 0.26026554 | North_Ossetian |
| 0.26052663 | Mixtec |
| 0.26086147 | Iraqi_Jew |
| [...] | |
| 0.26600958 | Sardinian |
| 0.29903512 | Yemenite_Jew |
| 0.31932877 | Jordanian |
| 0.37179271 | Samaritan |
| 0.43864190 | Saharawi |
Set SGDP without MA Samples (final_dataset_geno with Maf and Geno Filters and LD Pruning)
| Distance to: | SampleName |
|---|---|
| 0.05634010 | Icelandic |
| 0.06060490 | English |
| 0.06303751 | Hungarian |
| 0.06800178 | Orcadian |
| 0.07342989 | Norwegian |
| 0.07497899 | Polish |
| 0.07603280 | French |
| 0.08548837 | Czech |
| 0.09662956 | Finnish |
| 0.09993134 | Estonian |
| 0.10287890 | Russian |
| 0.12916700 | Bulgarian |
| 0.12990495 | Bergamo |
| 0.13653458 | Tuscan |
| 0.13819052 | Spanish |
(i) vahaduo.github.io/vahaduo (Single/Multi Mode Runs Set of Complete SGDP Samples without Australian and without MA Samples, final_dataset)
Run #1 (Single Mode)
| Target: SampleName Distance: 2.2918% / 0.02291798 | |
|---|---|
| 22.0 | Orcadian |
| 19.8 | Finnish |
| 19.4 | Estonian |
| 14.8 | Norwegian |
| 11.2 | Sardinian |
| 6.0 | Basque |
| 4.4 | BedouinB |
| 2.2 | Surui |
| 0.2 | French |
Run #2 (Multi Mode, Add Dist. Col. = No)
| Target | Distance | Basque | BedouinB | Estonian | Finnish | French | Norwegian | Orcadian | Sardinian | Surui |
| SampleName | 0.02291345 | 4.8 | 4.4 | 19.4 | 19.0 | 0.8 | 15.4 | 22.6 | 11.6 | 2.0 |
Run #3 (Multi Mode, Add Dist. Col. = 0.25x, Recalculate = No)
| Target | Distance | BedouinB | Estonian | Finnish | French | Hungarian | Norwegian | Orcadian | Sardinian |
| SampleName | 0.02779523 | 3.4 | 12.2 | 0.8 | 24.0 | 12.2 | 23.8 | 22.4 | 1.2 |
The individual Monte Carlo runs vary slightly. Run #4 uses final_dataset_geno without Australian / Bantu, Multi Mode, Add Dist. Col. = 0.25x, Recalculate = No; Run #5 is the same with Add Dist. Col. = No.
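Why repeated runs differ slightly can be illustrated with a small random-search mixture fit. The Python sketch below is not Vahaduo's actual algorithm. It is a hypothetical hill climb that moves weight between sources in 0.2 % units and keeps a move whenever the distance to the target does not increase. The sources, the target, and all parameter names are invented.

```python
import random
from math import dist

def mix(units, sources, total_units):
    """Weighted average of the source coordinates."""
    dims = len(sources[0])
    return [sum(u * s[d] for u, s in zip(units, sources)) / total_units
            for d in range(dims)]

def monte_carlo_fit(target, sources, seed, steps=20000, total_units=500):
    """Random pairwise weight transfers; returns (weights in percent, distance)."""
    rng = random.Random(seed)
    n = len(sources)
    units = [total_units // n] * n
    units[0] += total_units - sum(units)  # start from (almost) equal weights
    best = dist(target, mix(units, sources, total_units))
    for _ in range(steps):
        donor, receiver = rng.randrange(n), rng.randrange(n)
        if donor == receiver or units[donor] == 0:
            continue
        units[donor] -= 1
        units[receiver] += 1
        candidate = dist(target, mix(units, sources, total_units))
        if candidate <= best:
            best = candidate      # keep the move
        else:
            units[donor] += 1     # undo the move
            units[receiver] -= 1
    return [100.0 * u / total_units for u in units], best

# Toy problem (hypothetical): four sources in two dimensions, so many weight
# combinations reproduce the target equally well.
sources = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
target = [0.3, 0.2]
fit_a = monte_carlo_fit(target, sources, seed=1)
fit_b = monte_carlo_fit(target, sources, seed=2)
print(fit_a)
print(fit_b)
```

When the sources overlap like this, different seeds can end at different weight splits with practically the same distance. This is the kind of run-to-run variation noted above.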
(5) Comparison of NNLS and PCA (great-grandparents)
| Method | 4x "German" | 1x "English-American" | 1x "Acadian-American" | 2x "Ashkenazic-Sephardic" | Quality | | | | | | | | | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| NNLS SGDP+MA K 64 | Norwegian | Finnish | Saxon | English | Norman | Mixe | Surui | Basque | Samaritan | Palestinian | Yemenite_Jew | Polish | Onge | Misfit |
| | 26.39 | 9.39 | 0.08 | 8.47 | 0.89 | 0.91 | 0.22 | 49.7 | 1.43 | 0.76 | 1.08 | 0.22 | 0.44 | 0.000316 |
| NNLS SGDP K 64 (final_dataset_geno) | Norwegian | French | Basque | Sardinian | Tuscan | Jordanian | Estonian | Misfit | | | | | | |
| | 37.87 | 16.67 | 15.36 | 8.76 | 4.89 | 0.16 | 16.28 | 0.000162 | | | | | | |
| PCA 25 SGDP+MA | Russian | Saxon | Norman | Mixtec | Crete | Tarrega | Erfurt_ME | Estonian | Lezgin | Distance | | | | |
| | 40.2 | 5.0 | 8.6 | 0.2 | 18.0 | 9.6 | 4.0 | 13.4 | 1.0 | 0.05571388 | | | | |
| PCA 25 SGDP Run #2 (final_dataset) | Norwegian | Finnish | Orcadian | French | Surui | Sardinian | Basque | BedouinB | Estonian | Distance | | | | |
| | 15.4 | 19.0 | 22.6 | 0.8 | 2.0 | 11.6 | 4.8 | 4.4 | 19.4 | 0.02291345 | | | | |
| PCA 25 SGDP Run #3 (ADR 0.25, final_dataset) | Norwegian | Finnish | Orcadian | French | Sardinian | Hungarian | BedouinB | Estonian | Distance | | | | | |
| | 23.8 | 0.8 | 22.4 | 24.0 | 1.2 | 12.2 | 3.4 | 12.2 | 0.02779523 | | | | | |
| PCA 25 SGDP Run #4 (ADR 0.25, final_dataset_geno) | Norwegian | Icelandic | Orcadian | French | Surui | Sardinian | Hungarian | BedouinB | Czech | Estonian | Distance | | | |
| | 23.4 | 17.4 | 17.4 | 14.0 | 0.2 | 3.6 | 16.6 | 2.4 | 0.4 | 4.6 | 0.02865856 | | | |
| PCA 25 SGDP Run #5 (final_dataset_geno) | Norwegian | Finnish | Icelandic | Orcadian | Surui | Sardinian | Hungarian | BedouinB | Estonian | Distance | | | | |
| | 20.4 | 12.0 | 5.2 | 20.4 | 2.4 | 13.6 | 4.6 | 3.2 | 18.2 | 0.02431329 | | | | |
E-Mail kriswagenseil [at] gmx [point] de