Why do we choose principal components based on maximum variance explained?












I've seen many people choose the number of principal components for PCA based on maximum variance explained. So my question is: do we always have to choose principal components based on maximum variance explained? Is this applicable to all scenarios, e.g. text count vectors (BoW, tf-idf, ...) where the number of dimensions is really high?



Does maximum variance mean that most of the information about my data in the higher-dimensional space is captured in the lower-dimensional representation?



Usually I'd plot something like this to see the variance explained.



import numpy as np
import matplotlib.pyplot as plt

# pca is an already-fitted sklearn.decomposition.PCA instance
plt.plot(np.cumsum(pca.explained_variance_ratio_))
plt.xlabel('Number of principal components')
plt.ylabel('Cumulative explained variance ratio')
plt.show()


[Plot: cumulative explained variance ratio against the number of principal components]
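From that curve I would then pick the smallest number of components that reaches some target fraction of the variance. A minimal sketch of what I mean (the 0.95 threshold and the random data below are just placeholders):

import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 30))    # stand-in for my real feature matrix

pca = PCA().fit(X)
cumvar = np.cumsum(pca.explained_variance_ratio_)

# smallest number of components whose cumulative explained variance reaches 95%
k = int(np.argmax(cumvar >= 0.95)) + 1
print(k)

# scikit-learn can also do this directly: PCA(n_components=0.95, svd_solver='full')
# keeps just enough components to explain at least 95% of the variance.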










Tags: machine-learning, python, scikit-learn, pca






asked 2 days ago by user214
3 Answers






> do we always have to choose principal components based on maximum variance explained?

Yes. "Maximum variance explained" is closely related to the main objective, as follows.

Our main objective is: for a limited budget of $K$ dimensions, what information $\mathbf{a}=(a_1,\ldots,a_K)$ should we keep from the original data $\mathbf{x}=(x_1,\ldots,x_D)$ ($D \gg K$) in order to be able to reconstruct $\mathbf{x}$ from $\mathbf{a}$ as closely as possible?

If we only allow rotation and scaling of the original data, i.e. $a_k := \mathbf{x} \cdot \mathbf{v}_k$ for an unknown set of vectors $V_K=\{\mathbf{v}_k \mid \mathbf{v}_k \in \mathbb{R}^D,\ 1 \leq k \leq K\}$, and define the reconstruction error as
$$loss(\mathbf{x},V_K):=\left\| \mathbf{x}-\underbrace{\sum_{k=1}^{K}\overbrace{(\mathbf{x} \cdot \mathbf{v}_k)}^{a_k}\mathbf{v}_k}_{\hat{\mathbf{x}}} \right\|^2,$$
then the solution $V^*_K$ that minimizes this error is PCA. For the first dimension, PCA keeps the projection of the data on the vector $\mathbf{v}^*_1$ in the direction of largest data variance, namely $a^*_1$. For the second dimension, it keeps the projection on the vector $\mathbf{v}^*_2$ in the direction of second-largest data variance, namely $a^*_2$, and so on.

In other words, when we try to find the $K$-vector set $V_K$ that minimizes $loss(X,V_K)=\frac{1}{N}\sum_{n=1}^{N}loss(\mathbf{x}_n,V_K)$, the solution $V^*_K$ includes the vector $\mathbf{v}^*_k$ that lies in the direction of the $k$-th largest data variance.

Note that "ratio of variance explained" is a measure from statistics. Using the previous notation, it is defined as

$$R(X,V_K):=1 - \frac{loss(X,V_K)}{Var(X)}.$$

Since the variance of the original data $Var(X)$ does not depend on the solution, minimizing $loss(X,V_K)$ is equivalent to maximizing $R(X,V_K)$. For example, if $K=2$, then $V^*_2=\{\mathbf{v}^*_1, \mathbf{v}^*_2\}$ minimizes $loss(X,V_2)$ and equivalently maximizes $R(X,V_2)$. Ideally, if the original data $X$ can be perfectly reconstructed from $V_K$, then $R(X, V_K)$ is $1$.

> Does maximum variance mean that most of the information about my data in the higher dimension is captured in the lower dimension?

Yes. If we agree that "keep as much information as possible" is equivalent to "be able to reconstruct the data as closely as possible", then our objective $\min_{V_K} loss(X,V_K)$ formalizes "keep as much information as possible", and its solution is "maximum variance".
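As a quick numerical check of this equivalence, here is a minimal sketch added for illustration (not part of the original derivation; note that scikit-learn works with mean-centred data, whereas the formulas above use the raw $\mathbf{x}$). Reconstructing from the top $K$ principal directions and computing $1 - loss/Var(X)$ reproduces the cumulative explained_variance_ratio_:

import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 10)) @ rng.normal(size=(10, 10))  # toy correlated data

K = 3
pca = PCA().fit(X)
Xc = X - pca.mean_                             # scikit-learn's PCA centres the data first

A = Xc @ pca.components_[:K].T                 # scores a_k = x . v_k
X_hat = A @ pca.components_[:K] + pca.mean_    # reconstruction from the top K directions

loss = np.mean(np.sum((X - X_hat) ** 2, axis=1))   # average squared reconstruction error
total_var = np.sum(np.var(X, axis=0))              # Var(X)
print(1 - loss / total_var)                        # R(X, V_K)
print(pca.explained_variance_ratio_[:K].sum())     # same value, up to floating-point error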






answered yesterday by Esmailian (edited 21 hours ago)
Principal Component Analysis is commonly used in machine learning as a preprocessing step for dimensionality reduction. You can imagine that this is useful for things like visualization or for reducing the size of your training set. We want to maximize the variance so that we preserve as much information about the original data as possible and lose only a small amount.



In answer to your question: yes, high variance in this case means preserving, in a lower dimension, most of the information captured in the high-dimensional data. There is a mathematical intuition for this: when you project points orthogonally onto a line, could you revert back to the original points?
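To make that projection intuition concrete, here is a minimal sketch with made-up 2-D data: project the points onto the single highest-variance direction and map them back; the only thing you cannot recover is the spread along the discarded direction.

import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(1)
X = rng.normal(size=(200, 2)) * [3.0, 0.5]   # 2-D points stretched mostly along one direction

pca = PCA(n_components=1).fit(X)
Z = pca.transform(X)                 # what we keep: 1-D coordinates along the top direction
X_back = pca.inverse_transform(Z)    # best attempt to revert back to the original points

# the irrecoverable part is the variance along the dropped direction (roughly 0.25 here)
print(np.mean(np.sum((X - X_back) ** 2, axis=1)))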



On that note, if someone would like to provide the mathematical intuition explicitly, I would welcome that answer.






answered 2 days ago by Ethan

In addition to what has been said:

> Why do we choose principal components based on maximum variance explained?

Because the variance left to the rest of the components is precisely the residual you want to minimize when looking for the best representation of your data in fewer dimensions (the best mean-square linear representation, of course).

> do we always have to choose principal components based on maximum variance explained?

Yes, if dimensionality reduction is what you want.

However, there are applications where the residual components are the ones that tell the story :-)
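One concrete case where the residual components tell the story is anomaly detection; the sketch below is added for illustration and is not part of the original answer. A point that looks ordinary along the top components but lies far from the principal subspace shows up with a large reconstruction error:

import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(2)
X = rng.normal(size=(300, 5)) @ rng.normal(size=(5, 20))   # data lying near a 5-D subspace of R^20

pca = PCA(n_components=5).fit(X)

def residual_error(points):
    # squared distance to the principal subspace, i.e. what the kept components miss
    back = pca.inverse_transform(pca.transform(points))
    return np.sum((points - back) ** 2, axis=1)

normal_point = X[:1]                                  # a typical sample
odd_point = X[:1] + 3.0 * rng.normal(size=(1, 20))    # perturbed off the subspace

print(residual_error(normal_point))   # close to 0: well explained by the top components
print(residual_error(odd_point))      # large: the residual flags it as unusual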






answered yesterday by m0nzderr












