How to calculate a weighted hierarchical clustering in Orange












I am doing my first cluster analysis with Orange, which I recently discovered and which looks promising for this iterative, interactive process.

Apparently, there are several linkage methods for building clusters from distances:

  • Single linkage (the distance between the closest elements of the two clusters)
  • Average linkage (the average distance between elements of the two clusters)
  • Complete linkage (the distance between the clusters' most distant elements)
  • Weighted linkage
  • Ward

Since I have several columns, and some of them matter more than others for defining clusters, it seemed to me that weighted linkage might be what I am looking for. Unfortunately, I don't know how to use it that way, because I have not found a way to assign a weight to each column.

To make things worse, I have only found an explanation of the first three methods in a post on Orange's blog, and nothing about weighted linkage (nor about Ward, which may be a recent addition, since the widget's help does not even mention it).

Am I on the right path to achieve what I am looking for? Is there any way to make some columns more or less influential when calculating the distances?
























      clustering orange















      asked 2 days ago









      ccamara


























          1 Answer












          "Weighted linkage" probably does not mean that you get to specify weights for the features. If that is what you need, build the distance matrix yourself.

          Instead, it most likely refers to the well-known weighted group average strategy found in most textbooks, often called WPGMA. There are two different definitions of "average": the usual average linkage (UPGMA) averages over all pairs of points, whereas WPGMA gives the two merged subclusters equal weight regardless of their sizes. So this is probably just the "other" average linkage.
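To make the two definitions of "average" concrete, here is a minimal pure-Python sketch, not Orange's code. It agglomerates a few 1-D points and reports the distance at which each merge happens. The function name `merge_heights` and the toy points are made up for illustration. In SciPy the same two update rules are available as `method="average"` and `method="weighted"` in `scipy.cluster.hierarchy.linkage`; that Orange's "Weighted" option uses exactly this rule is an assumption based on the answer above.

```python
from itertools import combinations


def merge_heights(points, method):
    """Agglomerate 1-D points and return the distance of each merge.

    method="average"  -> UPGMA: old distances are averaged by cluster size
    method="weighted" -> WPGMA: old distances are averaged with equal weight
    """
    clusters = {i: 1 for i in range(len(points))}  # cluster id -> size
    dist = {frozenset((i, j)): abs(points[i] - points[j])
            for i, j in combinations(range(len(points)), 2)}
    next_id = len(points)
    heights = []
    while len(clusters) > 1:
        # Pick the closest pair of current clusters.
        a, b = min(combinations(sorted(clusters), 2),
                   key=lambda pair: dist[frozenset(pair)])
        heights.append(dist[frozenset((a, b))])
        size_a, size_b = clusters.pop(a), clusters.pop(b)
        # Distance from the merged cluster to every remaining cluster.
        for k in clusters:
            d_ak = dist[frozenset((a, k))]
            d_bk = dist[frozenset((b, k))]
            if method == "average":
                new = (size_a * d_ak + size_b * d_bk) / (size_a + size_b)
            else:  # "weighted"
                new = (d_ak + d_bk) / 2
            dist[frozenset((next_id, k))] = new
        clusters[next_id] = size_a + size_b
        next_id += 1
    return heights


points = [0, 1, 3, 10]
heights_average = merge_heights(points, "average")
heights_weighted = merge_heights(points, "weighted")
print(heights_average)   # last merge at 26/3, about 8.67
print(heights_weighted)  # last merge at 8.25
```

The first two merges are identical. The methods only diverge once a cluster of size 2 and a single point are merged and then compared with the remaining point. Neither rule involves per-column weights, which is the point of the answer.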













          • Oh, I see, I misunderstood. If I read your answer correctly, I should build the distance matrix myself to express that one variable is more important than another, but honestly I don't know how to do that in Orange. Do you have any pointers?
            – ccamara, yesterday










          • I don't use Orange. Isn't it little more than a GUI frontend for sklearn by now?
            – Anony-Mousse, 20 hours ago
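As a rough illustration of "building the distance matrix yourself", the sketch below computes a feature-weighted Euclidean distance in plain Python. It is not Orange-specific: the helper names `weighted_distance_matrix` and `scale_columns`, the toy rows, and the weights are all hypothetical. It also shows an equivalent trick: multiplying column k by the square root of its weight and then using ordinary Euclidean distance gives the same numbers.

```python
import math


def weighted_distance_matrix(rows, weights):
    """Pairwise distances sqrt(sum_k w_k * (x_k - y_k)^2)."""
    n = len(rows)
    dist = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            d = math.sqrt(sum(w * (a - b) ** 2
                              for w, a, b in zip(weights, rows[i], rows[j])))
            dist[i][j] = dist[j][i] = d
    return dist


def scale_columns(rows, weights):
    """Multiply column k by sqrt(w_k) so that plain Euclidean distance
    on the result equals the weighted distance on the original rows."""
    return [[math.sqrt(w) * x for w, x in zip(weights, row)] for row in rows]


rows = [[0, 0], [3, 0], [0, 4]]   # toy data: 3 rows, 2 columns
weights = [4, 1]                  # assumed: column 0 counts 4x as much

unweighted = weighted_distance_matrix(rows, [1, 1])
weighted = weighted_distance_matrix(rows, weights)
scaled_rows = scale_columns(rows, weights)

print(unweighted[0][1], unweighted[0][2])  # 3.0 4.0: row 0 is closest to row 1
print(weighted[0][1], weighted[0][2])      # 6.0 4.0: row 0 is now closest to row 2
```

Either the weighted matrix or the rescaled table can then be passed to any hierarchical clustering routine. In a GUI workflow the rescaling route is probably the easier one, provided the columns are on comparable scales beforehand and the distance step does not normalise them again afterwards; which widgets to use for that in Orange is not covered here.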











          answered yesterday









          Anony-Mousse











