Tuesday, January 28, 2014

A little more about the North Asian admixture in Northeast Europe

In my previous text, Admix analyses – absolutes?, I examined the rules of admixture analyses and tested the corresponding algorithms. I showed that a formal admix analysis based on the commonly used MCMC algorithm gives only relative results, and that the results vary depending on the “genetic edges” used. In my example the Siberian admix commonly shown among Northeast Europeans depended on the most western samples used in the analysis. So the Siberian admix shown in most analyses was the common affinity between NE Europeans and Siberians, calculated from the difference in genetic affinity between NE Europeans and the most western population. The result didn't indicate real directions of gene flow. Because genes are strongly related to geography, the result varies strongly. Using even the purest Siberians, NE Europeans and Lithuanians to find the Siberian admix in Northeast Europe gives a different result than using the same Siberians and NE Europeans with Sardinians (or French samples). This was the background.
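To make the dependence on the western reference concrete, here is a minimal toy sketch in Python. It is not the MCMC method used by admix tools. It assumes invented allele frequencies that change linearly from west to east, and a simple least-squares two-source estimate. The positions on the west-east axis, the slopes and the function names are all made up for illustration.

```python
def cline_freqs(position, slopes, base=0.2):
    """Toy allele frequencies that change linearly along a west-east axis (0 = west, 1 = east)."""
    return [base + slope * position for slope in slopes]


def two_source_fraction(target, west, east):
    """Least-squares share of `east` when `target` is modelled as a mix of `west` and `east`."""
    numerator = sum((t - w) * (e - w) for t, w, e in zip(target, west, east))
    denominator = sum((e - w) ** 2 for w, e in zip(west, east))
    return numerator / denominator


# Invented per-locus slopes; none of these numbers come from real samples.
slopes = [0.1, 0.3, 0.5, 0.2, 0.4]

far_west = cline_freqs(0.0, slopes)   # stands in for a distant western reference
near_west = cline_freqs(0.3, slopes)  # stands in for a nearby western reference
target = cline_freqs(0.4, slopes)     # stands in for the population being tested
east = cline_freqs(1.0, slopes)       # stands in for the eastern reference

with_far = two_source_fraction(target, far_west, east)
with_near = two_source_fraction(target, near_west, east)

print(f"eastern share with the distant western reference: {with_far:.3f}")
print(f"eastern share with the nearby western reference:  {with_near:.3f}")
```

The tested population and the eastern reference are identical in both runs. Only the western reference changes, and the estimated eastern share changes with it.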

On this same issue I found an interesting new view in the latest monthly magazine of DnaTribes. DnaTribes was inspired by the famous Mal’ta boy from ancient Siberia. The Mal’ta boy has made headlines in genetic archaeology because his genes reveal that ancient Siberians were a link between Europeans and Native Americans, without East Asian genetic affinity. People with a similar genetic structure lived in Siberia and moved to Europe and America before the recent Siberians. It is likely that they later mixed in Siberia with people from East Asia, forming the recent Siberian populations. So the Mal’ta boy’s genes live on in America and Europe (mostly in Northern Europe), but very likely also in Siberia. DnaTribes has tested this question. And what did they find? They found that if they remove the local genes in every geographic location and look only at foreign genes, Siberian populations have around 3% Finnic genes. It is hard to say when these Finnic genes travelled to Siberia, or whether they are remains of the Mal’ta boy’s genes, but the result is fully consistent with my admix analyses, which indicate Northeast European genes in Siberia at a level of around 3-8%, even after purifying the Siberian samples of outliers.

Link to DnaTribes’ article.

Monday, January 6, 2014

PCA update, now including Scandinavians

I have added 6 new samples from Scandinavia: 4 Swedes and 2 Norwegians. Further analyses and comments will be added after I have completed the admix analyses.

PCA dimensions 1 and 2:



Full size image available here


And dimensions 1 and 3



Full size image available here

Sunday, January 5, 2014

Admix analyses – absolutes?




Many providers offer genetic admix analyses that describe our ancestry based on our inherited genes. We have also seen that the results vary quite a lot. Some analyses say that I have Scandinavian admix, some deny it. Some claim that I am almost fully North Russian, some deny this too. What is the truth? I am not trying to find it, only to shed light on the analyses themselves.

Three years ago I spent several days testing an admix tool that uses a method called MCMC (Markov chain Monte Carlo). It wasn't the same tool I use in the following tests, but it was also based on MCMC. In this test I use Structure 2.3.4. I am not a mathematician and can’t explain the formulas behind the method (I had my career in the software and telecom business); I only try to keep things understandable for readers and myself. I am reporting what anyone can see after careful tests with well-qualified data. There are also admix software tools that use a slightly different method. I have not tested all of them and can only give general observations about the others. One widely used admix program gives stricter results. It defines the most homogeneous populations and compares all others to these “absolutes”. The problem is that the most homogeneous populations usually don’t represent the ancestry of all other populations and are not historically relevant. The results can look logical, but they are artificial for certain populations.
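For readers who wonder what an MCMC-style estimate looks like in practice, the following is a toy sketch only. It is not the algorithm of Structure or of any other admix tool. It assumes two source populations with known allele frequencies and estimates one individual's share q from source A with a basic Metropolis sampler. All data are simulated, and the function names, the number of loci and the sampler settings are hypothetical choices.

```python
import math
import random


def log_likelihood(q, genotypes, freqs_a, freqs_b):
    """Log-likelihood of genotypes (0, 1 or 2 allele copies) given share q from source A."""
    total = 0.0
    for g, pa, pb in zip(genotypes, freqs_a, freqs_b):
        f = q * pa + (1.0 - q) * pb  # expected allele frequency in the individual
        total += g * math.log(f) + (2 - g) * math.log(1.0 - f)
    return total


def metropolis_admixture(genotypes, freqs_a, freqs_b,
                         steps=1500, burn_in=500, step_sd=0.05, seed=1):
    """Posterior mean of q under a uniform prior, using a random-walk Metropolis sampler."""
    rng = random.Random(seed)
    q = 0.5
    current = log_likelihood(q, genotypes, freqs_a, freqs_b)
    samples = []
    for step in range(steps):
        proposal = q + rng.gauss(0.0, step_sd)
        # Proposals outside [0, 1] have zero prior probability and are rejected.
        if 0.0 <= proposal <= 1.0:
            candidate = log_likelihood(proposal, genotypes, freqs_a, freqs_b)
            if rng.random() < math.exp(min(0.0, candidate - current)):
                q, current = proposal, candidate
        if step >= burn_in:
            samples.append(q)
    return sum(samples) / len(samples)


# Simulated data: every number below is invented for the example.
data_rng = random.Random(0)
n_loci = 500
freqs_a = [data_rng.uniform(0.6, 0.9) for _ in range(n_loci)]
freqs_b = [data_rng.uniform(0.1, 0.4) for _ in range(n_loci)]
true_q = 0.3
genotypes = [
    sum(data_rng.random() < true_q * pa + (1.0 - true_q) * pb for _ in range(2))
    for pa, pb in zip(freqs_a, freqs_b)
]

estimate = metropolis_admixture(genotypes, freqs_a, freqs_b)
print(f"true share: {true_q:.2f}, estimated share: {estimate:.2f}")
```

Note that the estimate is defined only relative to the two source populations handed to the sampler. Swap in different sources and the same genotypes yield a different q.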
  
A week ago I decided to repeat my tests. My goal was to find out how the result varies depending on the data used. After testing I can say for sure that there are no absolute values and that all results depend on which populations you decide to use. These analyses show clear geographic influence between populations, but the results are relative. All results are based on differences between the samples used, without any absolutes. Removing one neighboring population can change the result significantly. I could write kilometers of text about this and perhaps give more answers, but I think it is better to keep this short and go straight to the test results.

What would be a better subject than the Siberian admix among Northeast Europeans? Many analyses present it as indisputable. Some call it Northeast Asian admix. I am not disputing those results, just showing that the conclusions can be debatable. I use Nganassans as the Siberian population because they have proven to be the purest Siberians.

First the data needs qualification. My first analysis shows a fairly large Northeast European admix in four Nganassans. They have to be pruned away.
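The pruning step can be sketched roughly as follows. This is only an illustration: the sample IDs, the ancestry shares and the 10% cut-off are hypothetical, and the cut-off actually used for the Nganassans is not stated here.

```python
def prune_admixed(samples, cluster, max_share):
    """Split samples into those kept and those removed for exceeding max_share in `cluster`."""
    kept, removed = [], []
    for sample in samples:
        if sample["q"][cluster] > max_share:
            removed.append(sample)
        else:
            kept.append(sample)
    return kept, removed


# Hypothetical reference samples with invented ancestry shares from a first run.
reference_samples = [
    {"id": "ref_01", "q": {"siberian": 0.97, "ne_european": 0.03}},
    {"id": "ref_02", "q": {"siberian": 0.78, "ne_european": 0.22}},
    {"id": "ref_03", "q": {"siberian": 0.95, "ne_european": 0.05}},
    {"id": "ref_04", "q": {"siberian": 0.84, "ne_european": 0.16}},
]

# The 0.10 threshold is an assumption chosen only for this example.
kept, removed = prune_admixed(reference_samples, "ne_european", max_share=0.10)

print("kept:", [s["id"] for s in kept])
print("removed:", [s["id"] for s in removed])
```

After pruning, the analysis would be rerun with only the kept samples as the reference.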


 Populations:

1 - Han Chinese
2 - Nganassans
3 - Mordva
4 - North Russians (two of them may be West Russians, according to my observations)
5 - Chuvashes
6 - Southwestern Finns

After the qualification, the first step shows how Northeast Europeans compare with the purest Asian references. It turns out that the Nganassans still show Northeast European admix. In the comparison between Siberians and "Uralics" the gene flow seems to go from Europe to Siberia. Remember what I said about relativity. The Northeast European admix among Nganassans is 3.9 - 8.4%. Surprisingly, it looks just like the Siberian admix we'll see in the following tests. Among the Europeans only the Chuvashes show Asian admix. Green and blue represent the Northeast European gene pool. Getting a more detailed picture of this would need higher k values.



The next images show what happens after adding more European populations. First I added Lithuanians and Belorussians. The result is a reversed gene flow between Northeast Europeans and Nganassans. Evidently the added European references create a new differential cluster, now shared by Northeast Europeans and the Balto-Slavic groups, and this new composition pushes the Northeast European admix among Nganassans (seen above) to the other side of the Urals. Different clusters pop up in different places depending on the differences between the populations used.

Siberian admixes

Russians 0.4 - 4.4%
Mordva 3 - 5%
Southwestern Finns 0.8 - 4.3%







1 -  Han  Chinese
2 -  Nganassans
3 -  Mordva
4 -  North Russians
5 -  Chuvashes
6 -  Southwestern Finns

7 -  Lithuanians
8 -  Belorussians

Adding more Europeans again gives a slightly different composition.

Siberian admixes 

Russians 2.7 - 6.6%
Mordva 4.9 - 7.3%
Southwestern Finns 3.4 - 6.7%




1 -  Han  Chinese
2 -  Nganassans
3 -  Mordva
4 -  North Russians
5 -  Chuvashes
6 -  Southwestern Finns
7 -  Lithuanians
8 -  Belorussians
9 -  Hungarians
10 - Orcadians
11 - CEU
12 - French


Triangle images: