Loadings vs eigenvectors in PCA: when to use one or another?

  • In principal component analysis (PCA), we get eigenvectors (unit vectors) and eigenvalues. Now, let us define loadings as $$\text{Loadings} = \text{Eigenvectors} \cdot \sqrt{\text{Eigenvalues}}.$$

    I know that eigenvectors are just directions and loadings (as defined above) also include variance along these directions. But for my better understanding, I would like to know where I should use loadings instead of eigenvectors? An example would be perfect!

    I have generally only seen people using eigenvectors but every once in a while they use loadings (as defined above) and then I am left feeling that I do not really understand the difference.

  • ttnphns

    ttnphns Correct answer

    6 years ago

    In PCA, you split the covariance (or correlation) matrix into a scale part (eigenvalues) and a direction part (eigenvectors). You may then endow the eigenvectors with the scale: the result is loadings. Loadings thus become comparable in magnitude with the covariances/correlations observed between the variables, because what had been drawn out from the variables' covariation now returns in the form of the covariation between the variables and the principal components. Actually, loadings are the covariances/correlations between the original variables and the unit-scaled components. This answer shows geometrically what loadings are and what are coefficients associating components with variables in PCA or factor analysis.


    1. Loadings help you interpret principal components or factors, because they are the linear combination weights (coefficients) whereby unit-scaled components or factors define or "load" a variable.

      (An eigenvector is just a coefficient of orthogonal transformation or projection; it is devoid of "load" within its value. "Load" is (information about the amount of) variance, magnitude. PCs are extracted to explain variance of the variables. Eigenvalues are the variances of (= explained by) PCs. When we multiply an eigenvector by the square root of the eigenvalue we "load" the bare coefficient with the amount of variance. By that virtue we make the coefficient a measure of association, co-variability.)

    2. Loadings sometimes are "rotated" (e.g. varimax) afterwards to facilitate interpretability (see also);

    3. It is loadings which "restore" the original covariance/correlation matrix (see also this thread discussing nuances of PCA and FA in that respect);

    4. While in PCA you can compute values of components both from eigenvectors and loadings, in factor analysis you compute factor scores out of loadings.

    5. And, above all, loading matrix is informative: its vertical sums of squares are the eigenvalues, components' variances, and its horizontal sums of squares are portions of the variables' variances being "explained" by the components.

    6. Rescaled or standardized loading is the loading divided by the variable's st. deviation; it is the correlation. (If your PCA is correlation-based PCA, loading is equal to the rescaled one, because correlation-based PCA is the PCA on standardized variables.) Rescaled loading squared has the meaning of the contribution of a pr. component into a variable; if it is high (close to 1) the variable is well defined by that component alone.
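    The properties listed in points 3, 5 and 6 can be checked numerically. The following is a minimal sketch, not the answerer's own computation: it assumes covariance-based PCA on made-up toy data, and all variable names (`loadings`, `restored`, `rescaled`, and so on) are chosen here for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical toy data: 200 observations of 3 correlated variables.
mixing = np.array([[2.0, 0.5, 0.0],
                   [0.0, 1.0, 0.3],
                   [0.0, 0.0, 0.5]])
X = rng.normal(size=(200, 3)) @ mixing
Xc = X - X.mean(axis=0)                      # center the variables
C = np.cov(Xc, rowvar=False)                 # covariance matrix (ddof=1)

# Eigendecomposition, sorted by descending eigenvalue.
eigvals, eigvecs = np.linalg.eigh(C)
order = np.argsort(eigvals)[::-1]
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

# Loadings as defined in the question: eigenvector j scaled by sqrt(eigenvalue j).
loadings = eigvecs * np.sqrt(eigvals)

col_ss = (loadings ** 2).sum(axis=0)         # vertical sums of squares
row_ss = (loadings ** 2).sum(axis=1)         # horizontal sums of squares
restored = loadings @ loadings.T             # loadings "restore" the covariance matrix

# Unit-variance (standardized) components and their covariances with the variables.
scores_std = Xc @ eigvecs / np.sqrt(eigvals)
cov_var_pc = Xc.T @ scores_std / (len(Xc) - 1)

# Rescaled loadings: divide each row by that variable's standard deviation.
rescaled = loadings / np.sqrt(np.diag(C))[:, None]
corr_var_pc = np.corrcoef(np.hstack([Xc, scores_std]), rowvar=False)[:3, 3:]

print(np.allclose(col_ss, eigvals))          # column sums of squares = eigenvalues
print(np.allclose(row_ss, np.diag(C)))       # row sums = variances (all components kept)
print(np.allclose(restored, C))
print(np.allclose(cov_var_pc, loadings))     # loadings = cov(variable, unit-scaled PC)
print(np.allclose(rescaled, corr_var_pc))    # rescaled loadings = correlations
```

    With all components retained, the horizontal sums of squares equal the full variances; if only the first few columns of `loadings` are kept, they give the portions of variance explained by those components.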

    An example of computations done in PCA and FA for you to see.

    Eigenvectors are unit-scaled loadings; and they are the coefficients (the cosines) of orthogonal transformation (rotation) of variables into principal components or back. Therefore it is easy to compute the components' values (not standardized) with them. Besides that their usage is limited. Eigenvector value squared has the meaning of the contribution of a variable into a pr. component; if it is high (close to 1) the component is well defined by that variable alone.
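    As a rough illustration of the paragraph above, the sketch below uses the eigenvectors as a rotation: it computes the (non-standardized) component values, rotates them back to the variables, and checks that each eigenvector's squared entries sum to 1. The data and names are invented for the example.

```python
import numpy as np

rng = np.random.default_rng(1)
# Hypothetical toy data: 3 variables with different spreads.
X = rng.normal(size=(100, 3)) * np.array([3.0, 1.0, 0.5])
Xc = X - X.mean(axis=0)

eigvals, V = np.linalg.eigh(np.cov(Xc, rowvar=False))
eigvals, V = eigvals[::-1], V[:, ::-1]       # descending order

scores = Xc @ V                              # rotate variables into components
back = scores @ V.T                          # rotate components back into variables
score_var = scores.var(axis=0, ddof=1)       # variances of the raw components
contrib = V ** 2                             # squared eigenvector entries

print(np.allclose(V.T @ V, np.eye(3)))       # eigenvectors are orthonormal
print(np.allclose(back, Xc))                 # the rotation is exactly invertible
print(np.allclose(score_var, eigvals))       # raw component variances = eigenvalues
print(np.allclose(contrib.sum(axis=0), 1.0)) # contributions within a component sum to 1
```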

    Although eigenvectors and loadings are simply two different ways to normalize coordinates of the same points representing columns (variables) of the data on a biplot, it is not a good idea to mix the two terms. This answer explained why. See also.

    Is it possible that there exist different conventions in different fields here? I stumbled over this question, because in my field (chemometrics) the usual way is to have orthonormal loadings. In other words, the scale/magnitude/$\sqrt{{\rm eigenvalues}}$ goes into the scores, not into the loadings. Loadings equal the inverse = transpose of the eigenvector matrix. I double checked this with both the "Handbook of Chemometrics and Qualimetrics" and the "Comprehensive Chemometrics" which I consider the 2 most important reference works for chemometrics.

    Side note: In chemometrics, calculating scores from original data is of huge importance, as lots of predictive models use PCA rotation (!) for pre-processing, so the limited use of loadings is IMHO our main use for PCA.

    @cbeleites, It is not only possible that PCA/FA terminological conventions may differ in different fields (or in different software or books) - I state they do differ. In psychology and human behaviour "loadings" are usually what I labeled by the name (loadings are very important in those fields because interpretation of the latents is pending, while the scores may be scaled down, standardized, and nobody cares). On the other hand, many `R` users on this site have called PCA's eigenvectors "loadings", which probably comes from the function documentation.

    (cont.) Worst of all is that the word "loadings" is being used in other techniques (LDA, canonical correlations, and so on) not exactly in the same meaning as in PCA. So, the word itself is compromised. I agree with @amoeba, who proposes that it be dropped altogether and replaced by statistically precise terms such as "correlations" or "coefficients". On the other hand, "eigenvectors" seem to be confined to svd/eigen decomposition, and some methods of dimensionality reduction do not perform those at all or in their classic form.

    (cont.) If we look at PCA in a general context of biplot we discover that both eigenvectors and loadings are simply coordinates of the columns of the data table, differing only by the amount of inertia (scale) spread over them.
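    The two conventions discussed in these comments can be placed side by side with a singular value decomposition. This is only a sketch under assumptions: the data are made up, "convention A" stands for the orthonormal-loadings usage described for chemometrics, and "convention B" for loadings as defined in the question; the labels A and B are introduced here.

```python
import numpy as np

rng = np.random.default_rng(2)
# Hypothetical toy data: 50 observations of 4 variables.
X = rng.normal(size=(50, 4)) * np.array([4.0, 2.0, 1.0, 0.5])
Xc = X - X.mean(axis=0)
n = Xc.shape[0]

U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
V = Vt.T
eigvals = s ** 2 / (n - 1)                   # eigenvalues of the covariance matrix

# Convention A: orthonormal loadings, all scale goes into the scores.
scores_a = U * s
loadings_a = V

# Convention B: unit-variance scores, all scale goes into the loadings.
scores_b = U * np.sqrt(n - 1)
loadings_b = V * np.sqrt(eigvals)

print(np.allclose(scores_a @ loadings_a.T, Xc))   # both conventions reproduce the data
print(np.allclose(scores_b @ loadings_b.T, Xc))
print(np.allclose(scores_a.var(axis=0, ddof=1), eigvals))
print(np.allclose(scores_b.var(axis=0, ddof=1), 1.0))
```

    Both factorizations describe the same centered data table; they differ only in where the singular values (the scale) are placed.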

    +1, this is a nice answer. Perhaps I can add here a link to this thread which expands directly on your point #3. @cbeleites: It's useful to know that in chemometrics the word "loadings" (a) is actually used and (b) means covariance matrix eigenvectors. In machine learning, the word "loadings" is rarely used, so the issue usually does not arise. (By the way, I prefer to call "eigenvectors" *principal axes* or *principal directions*. In my answers here I have been careful to use the word "loadings" only as ttnphns uses it.)

    @ttnphns Thank you for your insightful answer, very helpful. I do have a question about #4 though. If I compute scores using eigenvectors, then the covariance matrix of the scores is very easily interpreted by comparing it to the variance of the original data. But if I calculate scores using loadings, the covariance matrix has very large numbers, almost meaningless. I wonder in what situation it makes sense to calculate scores using loadings.

    You must be mixing something up. When you properly compute PC scores with the help of loadings you end up with simply standardized components. You do not compute these scores by the same formula as you do with eigenvectors; rather, you should use the formulas described in the link of my #4.
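    The linked formulas are not reproduced in this thread, so the following sketch rests on an assumption: that the score coefficient matrix for PCA is $B = A\Lambda^{-1}$ (equivalently $C^{-1}A$), where $A$ is the loading matrix, $\Lambda$ the diagonal matrix of eigenvalues and $C$ the covariance matrix. It contrasts that with naively multiplying the data by the loadings, which inflates the score variances to the squared eigenvalues.

```python
import numpy as np

rng = np.random.default_rng(3)
# Hypothetical toy data with large, unequal variances.
X = rng.normal(size=(300, 3)) * np.array([10.0, 4.0, 1.0])
Xc = X - X.mean(axis=0)
C = np.cov(Xc, rowvar=False)

eigvals, V = np.linalg.eigh(C)
eigvals, V = eigvals[::-1], V[:, ::-1]       # descending order
A = V * np.sqrt(eigvals)                     # loadings

# Naive use of loadings in the eigenvector formula: variances become eigvals**2.
naive = Xc @ A
naive_var = naive.var(axis=0, ddof=1)

# Assumed coefficient matrix B = A / eigenvalues (same as inv(C) @ A).
B = A / eigvals
proper = Xc @ B                              # standardized component scores
proper_cov = np.cov(proper, rowvar=False)

print(np.allclose(naive_var, eigvals ** 2))
print(np.allclose(B, np.linalg.inv(C) @ A))
print(np.allclose(proper_cov, np.eye(3)))    # unit variances, uncorrelated
```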

    @ttnphns wikipedia says "the loading vectors are eigenvectors of $X^TX$". Luckily, anyone that feels strongly about that terminology can fix it :)

  • There seems to be a great deal of confusion about loadings, coefficients and eigenvectors. The word loadings comes from Factor Analysis and it refers to coefficients of the regression of the data matrix onto the factors. They are not the coefficients defining the factors. See for example Mardia, Bibby and Kent or other multivariate statistics textbooks.

    In recent years the word loadings has been used to indicate the PCs' coefficients. Here it seems that it is used to indicate the coefficients multiplied by the square root of the eigenvalues of the matrix. These are not quantities commonly used in PCA. The principal components are defined as the sum of the variables weighted with unit norm coefficients. In this way the PCs have norm equal to the corresponding eigenvalue, which in turn is equal to the variance explained by the component.

    It is in Factor Analysis that the factors are required to have unit norm. But FA and PCA are completely different. Rotating the PCs' coefficients is very rarely done because it destroys the optimality of the components.

    In FA the factors are not uniquely defined and can be estimated in different ways. The important quantities are the loadings (the true ones) and the communalities which are used to study the structure of the covariance matrix. PCA or PLS should be used to estimate components.

    This answer, correct in particular aspects (+1), overlooks that both FA and PCA can be seen (and compared, though they are distinct) as prediction of manifest variables by the factors/components (the latter taken unit-scaled). Loadings are the coefficients of that prediction. So "loadings" is a valid term in use, meaning the same thing, in both the FA and PCA fields.

    Also, it is a pity that some sources (particularly, R documentation) carelessly call eigenvector coefficients "loadings" - they contain no _load_ in them.

    It is just that FA and PCA are estimating different models. In FA the errors are orthogonal; in PCA they are not. I don't see much point in comparing the results, unless one is fishing for a model. Loadings are the columns of the matrix `L` which is used to write the covariance matrix as `S = LL' + C` where `C` is a diagonal matrix. They have nothing to do with the PCs' coefficients.

    `they have nothing to do with the PCs' coefficients` We do compute loadings in PCA like we do it in FA. The models are different but the meaning of loadings is similar in both methods.
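    The claim in the comments above, that PCA loadings are the coefficients predicting the variables from unit-scaled components, can be checked with an ordinary least-squares fit. This is a sketch on invented data, not code from any of the commenters; it also shows that when only some components are kept, the residual covariance is generally not diagonal, which is the model difference from FA mentioned above.

```python
import numpy as np

rng = np.random.default_rng(4)
# Hypothetical toy data: 150 observations of 3 correlated variables.
mixing = np.array([[1.5, 0.7, 0.2],
                   [0.0, 1.0, 0.6],
                   [0.0, 0.0, 0.8]])
X = rng.normal(size=(150, 3)) @ mixing
Xc = X - X.mean(axis=0)
C = np.cov(Xc, rowvar=False)

eigvals, V = np.linalg.eigh(C)
eigvals, V = eigvals[::-1], V[:, ::-1]       # descending order
loadings = V * np.sqrt(eigvals)
Z = Xc @ V / np.sqrt(eigvals)                # unit-variance component scores

# Regress every variable on all standardized components.
coef, *_ = np.linalg.lstsq(Z, Xc, rcond=None)    # shape: (components, variables)

# Keep only the first m components and look at what is left unexplained.
m = 2
coef_m, *_ = np.linalg.lstsq(Z[:, :m], Xc, rcond=None)
residual = Xc - Z[:, :m] @ coef_m
resid_cov = np.cov(residual, rowvar=False)

print(np.allclose(coef.T, loadings))             # regression coefficients = loadings
print(np.allclose(coef_m.T, loadings[:, :m]))
print(np.allclose(resid_cov, C - loadings[:, :m] @ loadings[:, :m].T))
```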

  • I was a bit confused by those names, so I looked in the book "Statistical Methods in the Atmospheric Sciences", which gives a summary of the varied terminology of PCA. Here are screenshots from the book; I hope they help.

    [Two screenshots from the book; images not reproduced here]

  • There appears to be some confusion over this matter, so I will provide some observations and a pointer to where an excellent answer can be found in the literature.

    Firstly, PCA and Factor Analysis (FA) are related. In general, principal components are orthogonal by definition whereas factors - the analogous entity in FA - are not. Simply put, principal components span the factor space in an arbitrary but not necessarily useful way due to their being derived from pure eigenanalysis of the data. Factors on the other hand represent real-world entities which are only orthogonal (i.e. uncorrelated or independent) by coincidence.

    Say we take s observations from each of l subjects. These can be arranged into a data matrix D having s rows and l columns. D can be decomposed into a score matrix S and a loading matrix L such that D = SL. S will have s rows, and L will have l columns, the second dimension of each being the number of factors n. The purpose of factor analysis is to decompose D in such a way as to reveal the underlying scores and factors. The loadings in L tell us the proportion of each score which makes up the observations in D.

    In PCA, L has the eigenvectors of the correlation or covariance matrix of D as its columns. These are conventionally arranged in descending order of the corresponding eigenvalues. The value of n - i.e. the number of significant principal components to retain in the analysis, and hence the number of rows of L - is typically determined through the use of a scree plot of the eigenvalues or one of numerous other methods to be found in the literature. The columns of S in PCA form the n abstract principal components themselves. The value of n is the underlying dimensionality of the data set.

    The object of factor analysis is to transform the abstract components into meaningful factors through the use of a transformation matrix T such that $D = S T T^{-1} L$. $(ST)$ is the transformed score matrix, and $(T^{-1}L)$ is the transformed loading matrix.

    The above explanation roughly follows the notation of Edmund R. Malinowski from his excellent Factor Analysis in Chemistry. I highly recommend the opening chapters as an introduction to the subject.
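    The algebra behind the transformation step is simply that inserting $T T^{-1}$ between the score and loading matrices leaves their product unchanged. A minimal sketch with arbitrary made-up matrices (the dimensions and the matrix `T` are chosen here only for illustration):

```python
import numpy as np

rng = np.random.default_rng(5)
s_rows, l_cols, n = 6, 4, 2                  # hypothetical sizes; n = number of factors

S = rng.normal(size=(s_rows, n))             # score matrix, s x n
L = rng.normal(size=(n, l_cols))             # loading matrix, n x l
D = S @ L                                    # data matrix, s x l

# Any invertible n x n matrix can serve as the transformation.
T = np.array([[2.0, 1.0],
              [0.5, 1.5]])

S_t = S @ T                                  # transformed scores
L_t = np.linalg.inv(T) @ L                   # transformed loadings

print(np.allclose(S_t @ L_t, D))             # the product D is unchanged
```

    Because any invertible `T` works, the decomposition alone does not single out one set of factors; extra criteria are needed to choose the transformation.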

    This answer seems to have several problems. First, check your formulas, please, they are not correct. Second, you are trying to discuss differences between FA and PCA. We have a separate long thread on CV for that, while the current thread is about loadings vs eigenvectors, so the answer is misplaced. Third, your picture of FA is distorted, especially in phrases such as "the purpose of FA is to decompose D" or "the object of FA is to transform the abstract components into meaningful factors".

    I consider the material I have posted to be relevant to the discussion in this thread, and it offers one explanation of the relationship between loadings and eigenvectors.

    My research on the subject is summarised in this paper: http://onlinelibrary.wiley.com/doi/10.1002/sia.740231303/full

    OK, maybe your account is a special but still valid one - I can't say without reading the sources you offer. Yet I'd remark that the "relationship" between loadings and eigenvectors in PCA is all in the formula given in the question, so there is hardly anything to "explain" (what should be explained is their different utility). Another thing to remark is that the question is primarily about PCA, not FA. And, finally, not every FA method deals with eigenvectors at all, while it necessarily deals with loadings.

    Apologies, I don't think there is a publicly available version of my paper, although you can get access through Deepdyve.com with a two-week trial. The first chapter of Malinowski's book is available from the link above. This covers the basics without mentioning eigenanalysis. I must admit that I was unaware that factor analysis could be done without eigenanalysis, as the variant I have used - target factor analysis - does.

    Brief summary of methods of FA on this site. They all do some sort of eigenanalysis, but not all do that _of correlation/covariance matrix_, like PCA does it.

License under CC-BY-SA with attribution

Content dated before 6/26/2020 9:53 AM