PCA in visual Analytics












0












$begingroup$


I m studying visual analytics and i have a theoretical question about this topic.



My professor introduced this schema in him slide



enter image description here



For connect data to visualisation.
Some topic is very easy to understand and regards the rappresentation of data, plot and other tool, other topic instead regards the "data" in particular the data reduction using algorithm like PCA.



I m not very clear why using PCA over the data for reduction some data that we want rappresent.
Ad example, if i have dataset with all movie(it s just an example) with some attribute ad example



id,title,year,review, userReview ecc...



Where is the correlation of this data?Why i need apply a PCA for reduce big database and lose some of this components that I can visualize in some plot or schema that for me is important
For example could be interesting see all best movie that have high average review, using PCA we just cluster user in different movie (i saw example on Iris database here: https://medium.com/@kyasar.mail/pca-principal-component-analysis-729068e28ec8 )



Anyone can clarify me?










share|improve this question









$endgroup$

















    0












    $begingroup$


    I m studying visual analytics and i have a theoretical question about this topic.



    My professor introduced this schema in him slide



    enter image description here



    For connect data to visualisation.
    Some topic is very easy to understand and regards the rappresentation of data, plot and other tool, other topic instead regards the "data" in particular the data reduction using algorithm like PCA.



    I m not very clear why using PCA over the data for reduction some data that we want rappresent.
    Ad example, if i have dataset with all movie(it s just an example) with some attribute ad example



    id,title,year,review, userReview ecc...



    Where is the correlation of this data?Why i need apply a PCA for reduce big database and lose some of this components that I can visualize in some plot or schema that for me is important
    For example could be interesting see all best movie that have high average review, using PCA we just cluster user in different movie (i saw example on Iris database here: https://medium.com/@kyasar.mail/pca-principal-component-analysis-729068e28ec8 )



    Anyone can clarify me?










    share|improve this question









    $endgroup$















      0












      0








      0





      $begingroup$


      I m studying visual analytics and i have a theoretical question about this topic.



      My professor introduced this schema in him slide



      enter image description here



      For connect data to visualisation.
      Some topic is very easy to understand and regards the rappresentation of data, plot and other tool, other topic instead regards the "data" in particular the data reduction using algorithm like PCA.



      I m not very clear why using PCA over the data for reduction some data that we want rappresent.
      Ad example, if i have dataset with all movie(it s just an example) with some attribute ad example



      id,title,year,review, userReview ecc...



      Where is the correlation of this data?Why i need apply a PCA for reduce big database and lose some of this components that I can visualize in some plot or schema that for me is important
      For example could be interesting see all best movie that have high average review, using PCA we just cluster user in different movie (i saw example on Iris database here: https://medium.com/@kyasar.mail/pca-principal-component-analysis-729068e28ec8 )



      Anyone can clarify me?










      share|improve this question









      $endgroup$




      I m studying visual analytics and i have a theoretical question about this topic.



      My professor introduced this schema in him slide



      enter image description here



      For connect data to visualisation.
      Some topic is very easy to understand and regards the rappresentation of data, plot and other tool, other topic instead regards the "data" in particular the data reduction using algorithm like PCA.



      I m not very clear why using PCA over the data for reduction some data that we want rappresent.
      Ad example, if i have dataset with all movie(it s just an example) with some attribute ad example



      id,title,year,review, userReview ecc...



      Where is the correlation of this data?Why i need apply a PCA for reduce big database and lose some of this components that I can visualize in some plot or schema that for me is important
      For example could be interesting see all best movie that have high average review, using PCA we just cluster user in different movie (i saw example on Iris database here: https://medium.com/@kyasar.mail/pca-principal-component-analysis-729068e28ec8 )



      Anyone can clarify me?







      visualization data pca dimensionality-reduction






      share|improve this question













      share|improve this question











      share|improve this question




      share|improve this question










      asked yesterday









      theantomctheantomc

      142




      142






















          1 Answer
          1






          active

          oldest

          votes


















          0












          $begingroup$

          You have to see things for what they are. First, you have to figure out what PCA does. It's not clustering, but it can be used to realize that there could be clusters in the data.
          PCA is not always the answer. It depends on the data. There are other dimensionality reduction techniques such as autoencoders, non-negative matrix factorization and so on...



          For the movie example, it does not make sense to use PCA. The iris database contains another type of data, where it does make sense. For the movie example, you might want to have a look at singular value decomposition and collaborative filtering.



          One of the reasons for doing PCA could be for clustering. If you have many dimensions, and the intrinsic dimensionality is low you will have the curse of dimensionality. Reducing the dimensionality by PCA and doing clustering on the embeddings solves, in this case, the curse of dimensionality.






          share|improve this answer











          $endgroup$













          • $begingroup$
            I already do SVD for big data analysis in other course and i had challange with this stuff , all reduction (PCA,SVD, MDS, TSNE , ecc) but in this course is mandatory take a dataset and use a PCA and data visual (in D3). The professor link reduction with visualizzation , but i dont understand how
            $endgroup$
            – theantomc
            20 hours ago













          Your Answer





          StackExchange.ifUsing("editor", function () {
          return StackExchange.using("mathjaxEditing", function () {
          StackExchange.MarkdownEditor.creationCallbacks.add(function (editor, postfix) {
          StackExchange.mathjaxEditing.prepareWmdForMathJax(editor, postfix, [["$", "$"], ["\\(","\\)"]]);
          });
          });
          }, "mathjax-editing");

          StackExchange.ready(function() {
          var channelOptions = {
          tags: "".split(" "),
          id: "557"
          };
          initTagRenderer("".split(" "), "".split(" "), channelOptions);

          StackExchange.using("externalEditor", function() {
          // Have to fire editor after snippets, if snippets enabled
          if (StackExchange.settings.snippets.snippetsEnabled) {
          StackExchange.using("snippets", function() {
          createEditor();
          });
          }
          else {
          createEditor();
          }
          });

          function createEditor() {
          StackExchange.prepareEditor({
          heartbeatType: 'answer',
          autoActivateHeartbeat: false,
          convertImagesToLinks: false,
          noModals: true,
          showLowRepImageUploadWarning: true,
          reputationToPostImages: null,
          bindNavPrevention: true,
          postfix: "",
          imageUploader: {
          brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
          contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
          allowUrls: true
          },
          onDemand: true,
          discardSelector: ".discard-answer"
          ,immediatelyShowMarkdownHelp:true
          });


          }
          });














          draft saved

          draft discarded


















          StackExchange.ready(
          function () {
          StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fdatascience.stackexchange.com%2fquestions%2f46103%2fpca-in-visual-analytics%23new-answer', 'question_page');
          }
          );

          Post as a guest















          Required, but never shown

























          1 Answer
          1






          active

          oldest

          votes








          1 Answer
          1






          active

          oldest

          votes









          active

          oldest

          votes






          active

          oldest

          votes









          0












          $begingroup$

          You have to see things for what they are. First, you have to figure out what PCA does. It's not clustering, but it can be used to realize that there could be clusters in the data.
          PCA is not always the answer. It depends on the data. There are other dimensionality reduction techniques such as autoencoders, non-negative matrix factorization and so on...



          For the movie example, it does not make sense to use PCA. The iris database contains another type of data, where it does make sense. For the movie example, you might want to have a look at singular value decomposition and collaborative filtering.



          One of the reasons for doing PCA could be for clustering. If you have many dimensions, and the intrinsic dimensionality is low you will have the curse of dimensionality. Reducing the dimensionality by PCA and doing clustering on the embeddings solves, in this case, the curse of dimensionality.






          share|improve this answer











          $endgroup$













          • $begingroup$
            I already do SVD for big data analysis in other course and i had challange with this stuff , all reduction (PCA,SVD, MDS, TSNE , ecc) but in this course is mandatory take a dataset and use a PCA and data visual (in D3). The professor link reduction with visualizzation , but i dont understand how
            $endgroup$
            – theantomc
            20 hours ago


















          0












          $begingroup$

          You have to see things for what they are. First, you have to figure out what PCA does. It's not clustering, but it can be used to realize that there could be clusters in the data.
          PCA is not always the answer. It depends on the data. There are other dimensionality reduction techniques such as autoencoders, non-negative matrix factorization and so on...



          For the movie example, it does not make sense to use PCA. The iris database contains another type of data, where it does make sense. For the movie example, you might want to have a look at singular value decomposition and collaborative filtering.



          One of the reasons for doing PCA could be for clustering. If you have many dimensions, and the intrinsic dimensionality is low you will have the curse of dimensionality. Reducing the dimensionality by PCA and doing clustering on the embeddings solves, in this case, the curse of dimensionality.






          share|improve this answer











          $endgroup$













          • $begingroup$
            I already do SVD for big data analysis in other course and i had challange with this stuff , all reduction (PCA,SVD, MDS, TSNE , ecc) but in this course is mandatory take a dataset and use a PCA and data visual (in D3). The professor link reduction with visualizzation , but i dont understand how
            $endgroup$
            – theantomc
            20 hours ago
















          0












          0








          0





          $begingroup$

          You have to see things for what they are. First, you have to figure out what PCA does. It's not clustering, but it can be used to realize that there could be clusters in the data.
          PCA is not always the answer. It depends on the data. There are other dimensionality reduction techniques such as autoencoders, non-negative matrix factorization and so on...



          For the movie example, it does not make sense to use PCA. The iris database contains another type of data, where it does make sense. For the movie example, you might want to have a look at singular value decomposition and collaborative filtering.



          One of the reasons for doing PCA could be for clustering. If you have many dimensions, and the intrinsic dimensionality is low you will have the curse of dimensionality. Reducing the dimensionality by PCA and doing clustering on the embeddings solves, in this case, the curse of dimensionality.






          share|improve this answer











          $endgroup$



          You have to see things for what they are. First, you have to figure out what PCA does. It's not clustering, but it can be used to realize that there could be clusters in the data.
          PCA is not always the answer. It depends on the data. There are other dimensionality reduction techniques such as autoencoders, non-negative matrix factorization and so on...



          For the movie example, it does not make sense to use PCA. The iris database contains another type of data, where it does make sense. For the movie example, you might want to have a look at singular value decomposition and collaborative filtering.



          One of the reasons for doing PCA could be for clustering. If you have many dimensions, and the intrinsic dimensionality is low you will have the curse of dimensionality. Reducing the dimensionality by PCA and doing clustering on the embeddings solves, in this case, the curse of dimensionality.







          share|improve this answer














          share|improve this answer



          share|improve this answer








          edited yesterday

























          answered yesterday









          Carl RynegardhCarl Rynegardh

          30119




          30119












          • $begingroup$
            I already do SVD for big data analysis in other course and i had challange with this stuff , all reduction (PCA,SVD, MDS, TSNE , ecc) but in this course is mandatory take a dataset and use a PCA and data visual (in D3). The professor link reduction with visualizzation , but i dont understand how
            $endgroup$
            – theantomc
            20 hours ago




















          • $begingroup$
            I already do SVD for big data analysis in other course and i had challange with this stuff , all reduction (PCA,SVD, MDS, TSNE , ecc) but in this course is mandatory take a dataset and use a PCA and data visual (in D3). The professor link reduction with visualizzation , but i dont understand how
            $endgroup$
            – theantomc
            20 hours ago


















          $begingroup$
          I already do SVD for big data analysis in other course and i had challange with this stuff , all reduction (PCA,SVD, MDS, TSNE , ecc) but in this course is mandatory take a dataset and use a PCA and data visual (in D3). The professor link reduction with visualizzation , but i dont understand how
          $endgroup$
          – theantomc
          20 hours ago






          $begingroup$
          I already do SVD for big data analysis in other course and i had challange with this stuff , all reduction (PCA,SVD, MDS, TSNE , ecc) but in this course is mandatory take a dataset and use a PCA and data visual (in D3). The professor link reduction with visualizzation , but i dont understand how
          $endgroup$
          – theantomc
          20 hours ago




















          draft saved

          draft discarded




















































          Thanks for contributing an answer to Data Science Stack Exchange!


          • Please be sure to answer the question. Provide details and share your research!

          But avoid



          • Asking for help, clarification, or responding to other answers.

          • Making statements based on opinion; back them up with references or personal experience.


          Use MathJax to format equations. MathJax reference.


          To learn more, see our tips on writing great answers.




          draft saved


          draft discarded














          StackExchange.ready(
          function () {
          StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fdatascience.stackexchange.com%2fquestions%2f46103%2fpca-in-visual-analytics%23new-answer', 'question_page');
          }
          );

          Post as a guest















          Required, but never shown





















































          Required, but never shown














          Required, but never shown












          Required, but never shown







          Required, but never shown

































          Required, but never shown














          Required, but never shown












          Required, but never shown







          Required, but never shown







          Popular posts from this blog

          How to label and detect the document text images

          Vallis Paradisi

          Tabula Rosettana