Friday, December 21, 2012

Understanding Principal Component 1 and Principal Component 2 In Population Genetics

I've analyzed my genetic data with some of the web-based tools that use Principal Component Analysis (PCA), and I've wondered what exactly the plotted variables (PCA-1 and PCA-2, i.e. Principal Components 1 and 2) represent. For others who may have wondered the same thing, the video below, at [1:07:20-1:07:49], explains the significance of Principal Components 1 and 2 in population genetics:

Commenter - "In your final slide, the upper map was Principal Component 1, which is consistent with the Out of Africa hypothesis, but the lower map - I'm wondering if that's Principal Component 2 ...?"

Evolutionary geneticist Richard Lewontin - "It is."

Commenter - "... and I'm wondering that because there's a line across Africa at the Sahara, there's a line across Eurasia at the Urals, and that would seem to suggest that the primary signal of human genetics is out of Africa and the colonization of the planet, but the next strongest signal is the regional differentiation?"

Lewontin - "Absolutely, you're quite right about that."

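So, in the maps discussed in the video, PCA 1 (the first principal component, the axis that captures the most genetic variation) represents combined genetic elements corresponding to the movement of Homo sapiens out of Africa, while PCA 2 (the axis that captures the next most variation) represents combined genetic elements corresponding to the differentiation of regional populations of Homo sapiens.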
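
To make the idea of a "strongest signal" and a "next strongest signal" more concrete, here is a minimal Python sketch using NumPy. Everything in it is an assumption for illustration: the data are simulated, not real genotypes, the group labels `major` and `minor` and the function name `principal_components` are made up, and the web tools I used may compute their components differently. The toy data have one strong split between two pairs of groups and a weaker split inside each pair, loosely mirroring the two signals described in the video.

```python
import numpy as np

def principal_components(genotypes, n_components=2):
    """Return per-sample PC scores and the fraction of variance each PC explains."""
    X = np.asarray(genotypes, dtype=float)
    X = X - X.mean(axis=0)  # center each marker (column)
    # SVD of the centered matrix; singular values come back sorted largest first
    U, S, Vt = np.linalg.svd(X, full_matrices=False)
    scores = U[:, :n_components] * S[:n_components]
    explained = (S ** 2) / np.sum(S ** 2)
    return scores, explained[:n_components]

# Simulated toy data (hypothetical): 4 groups of 25 samples, 200 markers.
rng = np.random.default_rng(0)
n_per_group, n_markers = 25, 200
major = np.repeat([0, 0, 1, 1], n_per_group)  # strong split between group pairs
minor = np.repeat([0, 1, 0, 1], n_per_group)  # weaker split within each pair

# Allele frequencies: the first 120 markers differ strongly by the major split,
# the remaining 80 differ more mildly by the minor split.
freq = np.full((4 * n_per_group, n_markers), 0.5)
freq[:, :120] = np.where(major[:, None] == 1, 0.8, 0.2)
freq[:, 120:] = np.where(minor[:, None] == 1, 0.7, 0.3)

# Genotypes coded as 0, 1 or 2 copies of an allele at each marker.
genotypes = rng.binomial(2, freq)

scores, explained = principal_components(genotypes)
print("Fraction of variance explained by PC1 and PC2:", explained)
print("Correlation of PC1 with the major split:",
      abs(np.corrcoef(scores[:, 0], major)[0, 1]))
print("Correlation of PC2 with the minor split:",
      abs(np.corrcoef(scores[:, 1], minor)[0, 1]))
```

In this toy setup the first component lines up with the strong split and the second with the weaker one, which is the same ordering of signals the commenter describes. With real data, what each component corresponds to depends on which samples and markers go into the analysis.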

