How to match a user with another user based on their taste?












0












$begingroup$


Information available



Consider that there are N users on a platform. Every user adds items that they like on their profile. These items have static attributes that describe the product.



User A:
Row | Attribute a | Attribute b | Attribute c
Item 1| 0.593 | 0.7852 | 0.484
Item 2| 0.18 | 0.96 | 0.05
Item 3| 0.423 | 0.886 | 0.156

User B:
Row | Attribute a | Attribute b | Attribute c
Item 7| 0.228 | 0.148 | 0.658
Item 8| 0.785 | 0.33 | 0.887
Item 9| 0.569 | 0.994 | 0.374


User A has a list of items that he/she likes. Same goes with User B... User N. The items in the profiles of different users might or might not be the same but the items describe the User's taste for that particular item.



Goal



What I want to do is, match a User with another User if they have a similar taste in picking items. I don't understand how to achieve this. Any help is appreciated!










share|improve this question







New contributor




Dhaval Thakkar is a new contributor to this site. Take care in asking for clarification, commenting, and answering.
Check out our Code of Conduct.







$endgroup$

















    0












    $begingroup$


    Information available



    Consider that there are N users on a platform. Every user adds items that they like on their profile. These items have static attributes that describe the product.



    User A:
    Row | Attribute a | Attribute b | Attribute c
    Item 1| 0.593 | 0.7852 | 0.484
    Item 2| 0.18 | 0.96 | 0.05
    Item 3| 0.423 | 0.886 | 0.156

    User B:
    Row | Attribute a | Attribute b | Attribute c
    Item 7| 0.228 | 0.148 | 0.658
    Item 8| 0.785 | 0.33 | 0.887
    Item 9| 0.569 | 0.994 | 0.374


    User A has a list of items that he/she likes. Same goes with User B... User N. The items in the profiles of different users might or might not be the same but the items describe the User's taste for that particular item.



    Goal



    What I want to do is, match a User with another User if they have a similar taste in picking items. I don't understand how to achieve this. Any help is appreciated!










    share|improve this question







    New contributor




    Dhaval Thakkar is a new contributor to this site. Take care in asking for clarification, commenting, and answering.
    Check out our Code of Conduct.







    $endgroup$















      0












      0








      0





      $begingroup$


      Information available



      Consider that there are N users on a platform. Every user adds items that they like on their profile. These items have static attributes that describe the product.



      User A:
      Row | Attribute a | Attribute b | Attribute c
      Item 1| 0.593 | 0.7852 | 0.484
      Item 2| 0.18 | 0.96 | 0.05
      Item 3| 0.423 | 0.886 | 0.156

      User B:
      Row | Attribute a | Attribute b | Attribute c
      Item 7| 0.228 | 0.148 | 0.658
      Item 8| 0.785 | 0.33 | 0.887
      Item 9| 0.569 | 0.994 | 0.374


      User A has a list of items that he/she likes. Same goes with User B... User N. The items in the profiles of different users might or might not be the same but the items describe the User's taste for that particular item.



      Goal



      What I want to do is, match a User with another User if they have a similar taste in picking items. I don't understand how to achieve this. Any help is appreciated!










      share|improve this question







      New contributor




      Dhaval Thakkar is a new contributor to this site. Take care in asking for clarification, commenting, and answering.
      Check out our Code of Conduct.







      $endgroup$




      Information available



      Consider that there are N users on a platform. Every user adds items that they like on their profile. These items have static attributes that describe the product.



      User A:
      Row | Attribute a | Attribute b | Attribute c
      Item 1| 0.593 | 0.7852 | 0.484
      Item 2| 0.18 | 0.96 | 0.05
      Item 3| 0.423 | 0.886 | 0.156

      User B:
      Row | Attribute a | Attribute b | Attribute c
      Item 7| 0.228 | 0.148 | 0.658
      Item 8| 0.785 | 0.33 | 0.887
      Item 9| 0.569 | 0.994 | 0.374


      User A has a list of items that he/she likes. Same goes with User B... User N. The items in the profiles of different users might or might not be the same but the items describe the User's taste for that particular item.



      Goal



      What I want to do is, match a User with another User if they have a similar taste in picking items. I don't understand how to achieve this. Any help is appreciated!







      machine-learning python deep-learning recommender-system






      share|improve this question







      New contributor




      Dhaval Thakkar is a new contributor to this site. Take care in asking for clarification, commenting, and answering.
      Check out our Code of Conduct.











      share|improve this question







      New contributor




      Dhaval Thakkar is a new contributor to this site. Take care in asking for clarification, commenting, and answering.
      Check out our Code of Conduct.









      share|improve this question




      share|improve this question






      New contributor




      Dhaval Thakkar is a new contributor to this site. Take care in asking for clarification, commenting, and answering.
      Check out our Code of Conduct.









      asked yesterday









      Dhaval ThakkarDhaval Thakkar

      135




      135




      New contributor




      Dhaval Thakkar is a new contributor to this site. Take care in asking for clarification, commenting, and answering.
      Check out our Code of Conduct.





      New contributor





      Dhaval Thakkar is a new contributor to this site. Take care in asking for clarification, commenting, and answering.
      Check out our Code of Conduct.






      Dhaval Thakkar is a new contributor to this site. Take care in asking for clarification, commenting, and answering.
      Check out our Code of Conduct.






















          3 Answers
          3






          active

          oldest

          votes


















          0












          $begingroup$

          Well you could try unsupervised clustering. You may want to leave out the user and item label to start. Depending on how much data you have and guesses at how many "categories" you might end up with you can use K-means or Mean sift clustering. The idea would be you let the similarities be worked out so that you group the items together and give you the "Categories" and there for the similar items. Then you can use the model for any future.
          After you have done this you can introduce the User labels and item labels to build the similarity at the User level.



          A next step in exploration, depending on the item and attributes, might be reducing the attributes to the average of each item so that one user has averages of each attribute for all items and then use that data. Then you then averages to cluster in terms of types of "user"



          Both ways would assume the attributes for each item is very similar the attributes to the others items. eg



              item  | sweetness   |   acidity   |  bitterness
          orange| 0.593 | 0.7852 | 0.484
          banana| 0.18 | 0.96 | 0.05
          apple | 0.423 | 0.886 | 0.156


          Or you can just do direct numerical comparison between users so that you calculate something like statistical entropy between the two across all items per attribute, average for all attributes, and set a range so that if in a certain range they are considered similar or different.



          Hope this helps!






          share|improve this answer








          New contributor




          Lothilius is a new contributor to this site. Take care in asking for clarification, commenting, and answering.
          Check out our Code of Conduct.






          $endgroup$













          • $begingroup$
            the idea of using unsupervised learning is great but that would only be useful if I had a dataset of all the items from which users could add them. The problem I have is that these items and their attributes will be given to me by an API, so there is no chance that I can get the dataset of all items and their attributes. Also, I couldn't understand the part you said after model building to introduce user and item labels for similarities
            $endgroup$
            – Dhaval Thakkar
            yesterday





















          0












          $begingroup$

          You can perform clustering of your customers based on a distance function.
          Definition might look like this:




          1. First, calculate euclidean distances between the first item of the first customer's basket and all of the items in the second customer's basket.

          2. Then find out, what is the closest item from second customer's basket (minimum euclidean distance).

          3. Perform the same operation for each item in first customer's basket.

          4. Calculate mean of the minimum distances.

          5. Do the same for the second customer.

          6. Take maximum of means from the first and the second customer.






          share|improve this answer










          New contributor




          Michał Kardach is a new contributor to this site. Take care in asking for clarification, commenting, and answering.
          Check out our Code of Conduct.






          $endgroup$





















            0












            $begingroup$

            Is there a reason why you are not using a content-based recommender system? You can use a recommender to "group" users together and once they are grouped, you can introduce members to each other. I guess I don't understand why you are trying to re-invent the wheel on this one - a recommender can get you to where you want to be.






            share|improve this answer









            $endgroup$













              Your Answer





              StackExchange.ifUsing("editor", function () {
              return StackExchange.using("mathjaxEditing", function () {
              StackExchange.MarkdownEditor.creationCallbacks.add(function (editor, postfix) {
              StackExchange.mathjaxEditing.prepareWmdForMathJax(editor, postfix, [["$", "$"], ["\\(","\\)"]]);
              });
              });
              }, "mathjax-editing");

              StackExchange.ready(function() {
              var channelOptions = {
              tags: "".split(" "),
              id: "557"
              };
              initTagRenderer("".split(" "), "".split(" "), channelOptions);

              StackExchange.using("externalEditor", function() {
              // Have to fire editor after snippets, if snippets enabled
              if (StackExchange.settings.snippets.snippetsEnabled) {
              StackExchange.using("snippets", function() {
              createEditor();
              });
              }
              else {
              createEditor();
              }
              });

              function createEditor() {
              StackExchange.prepareEditor({
              heartbeatType: 'answer',
              autoActivateHeartbeat: false,
              convertImagesToLinks: false,
              noModals: true,
              showLowRepImageUploadWarning: true,
              reputationToPostImages: null,
              bindNavPrevention: true,
              postfix: "",
              imageUploader: {
              brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
              contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
              allowUrls: true
              },
              onDemand: true,
              discardSelector: ".discard-answer"
              ,immediatelyShowMarkdownHelp:true
              });


              }
              });






              Dhaval Thakkar is a new contributor. Be nice, and check out our Code of Conduct.










              draft saved

              draft discarded


















              StackExchange.ready(
              function () {
              StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fdatascience.stackexchange.com%2fquestions%2f46147%2fhow-to-match-a-user-with-another-user-based-on-their-taste%23new-answer', 'question_page');
              }
              );

              Post as a guest















              Required, but never shown

























              3 Answers
              3






              active

              oldest

              votes








              3 Answers
              3






              active

              oldest

              votes









              active

              oldest

              votes






              active

              oldest

              votes









              0












              $begingroup$

              Well you could try unsupervised clustering. You may want to leave out the user and item label to start. Depending on how much data you have and guesses at how many "categories" you might end up with you can use K-means or Mean sift clustering. The idea would be you let the similarities be worked out so that you group the items together and give you the "Categories" and there for the similar items. Then you can use the model for any future.
              After you have done this you can introduce the User labels and item labels to build the similarity at the User level.



              A next step in exploration, depending on the item and attributes, might be reducing the attributes to the average of each item so that one user has averages of each attribute for all items and then use that data. Then you then averages to cluster in terms of types of "user"



              Both ways would assume the attributes for each item is very similar the attributes to the others items. eg



                  item  | sweetness   |   acidity   |  bitterness
              orange| 0.593 | 0.7852 | 0.484
              banana| 0.18 | 0.96 | 0.05
              apple | 0.423 | 0.886 | 0.156


              Or you can just do direct numerical comparison between users so that you calculate something like statistical entropy between the two across all items per attribute, average for all attributes, and set a range so that if in a certain range they are considered similar or different.



              Hope this helps!






              share|improve this answer








              New contributor




              Lothilius is a new contributor to this site. Take care in asking for clarification, commenting, and answering.
              Check out our Code of Conduct.






              $endgroup$













              • $begingroup$
                the idea of using unsupervised learning is great but that would only be useful if I had a dataset of all the items from which users could add them. The problem I have is that these items and their attributes will be given to me by an API, so there is no chance that I can get the dataset of all items and their attributes. Also, I couldn't understand the part you said after model building to introduce user and item labels for similarities
                $endgroup$
                – Dhaval Thakkar
                yesterday


















              0












              $begingroup$

              Well you could try unsupervised clustering. You may want to leave out the user and item label to start. Depending on how much data you have and guesses at how many "categories" you might end up with you can use K-means or Mean sift clustering. The idea would be you let the similarities be worked out so that you group the items together and give you the "Categories" and there for the similar items. Then you can use the model for any future.
              After you have done this you can introduce the User labels and item labels to build the similarity at the User level.



              A next step in exploration, depending on the item and attributes, might be reducing the attributes to the average of each item so that one user has averages of each attribute for all items and then use that data. Then you then averages to cluster in terms of types of "user"



              Both ways would assume the attributes for each item is very similar the attributes to the others items. eg



                  item  | sweetness   |   acidity   |  bitterness
              orange| 0.593 | 0.7852 | 0.484
              banana| 0.18 | 0.96 | 0.05
              apple | 0.423 | 0.886 | 0.156


              Or you can just do direct numerical comparison between users so that you calculate something like statistical entropy between the two across all items per attribute, average for all attributes, and set a range so that if in a certain range they are considered similar or different.



              Hope this helps!






              share|improve this answer








              New contributor




              Lothilius is a new contributor to this site. Take care in asking for clarification, commenting, and answering.
              Check out our Code of Conduct.






              $endgroup$













              • $begingroup$
                the idea of using unsupervised learning is great but that would only be useful if I had a dataset of all the items from which users could add them. The problem I have is that these items and their attributes will be given to me by an API, so there is no chance that I can get the dataset of all items and their attributes. Also, I couldn't understand the part you said after model building to introduce user and item labels for similarities
                $endgroup$
                – Dhaval Thakkar
                yesterday
















              0












              0








              0





              $begingroup$

              Well you could try unsupervised clustering. You may want to leave out the user and item label to start. Depending on how much data you have and guesses at how many "categories" you might end up with you can use K-means or Mean sift clustering. The idea would be you let the similarities be worked out so that you group the items together and give you the "Categories" and there for the similar items. Then you can use the model for any future.
              After you have done this you can introduce the User labels and item labels to build the similarity at the User level.



              A next step in exploration, depending on the item and attributes, might be reducing the attributes to the average of each item so that one user has averages of each attribute for all items and then use that data. Then you then averages to cluster in terms of types of "user"



              Both ways would assume the attributes for each item is very similar the attributes to the others items. eg



                  item  | sweetness   |   acidity   |  bitterness
              orange| 0.593 | 0.7852 | 0.484
              banana| 0.18 | 0.96 | 0.05
              apple | 0.423 | 0.886 | 0.156


              Or you can just do direct numerical comparison between users so that you calculate something like statistical entropy between the two across all items per attribute, average for all attributes, and set a range so that if in a certain range they are considered similar or different.



              Hope this helps!






              share|improve this answer








              New contributor




              Lothilius is a new contributor to this site. Take care in asking for clarification, commenting, and answering.
              Check out our Code of Conduct.






              $endgroup$



              Well you could try unsupervised clustering. You may want to leave out the user and item label to start. Depending on how much data you have and guesses at how many "categories" you might end up with you can use K-means or Mean sift clustering. The idea would be you let the similarities be worked out so that you group the items together and give you the "Categories" and there for the similar items. Then you can use the model for any future.
              After you have done this you can introduce the User labels and item labels to build the similarity at the User level.



              A next step in exploration, depending on the item and attributes, might be reducing the attributes to the average of each item so that one user has averages of each attribute for all items and then use that data. Then you then averages to cluster in terms of types of "user"



              Both ways would assume the attributes for each item is very similar the attributes to the others items. eg



                  item  | sweetness   |   acidity   |  bitterness
              orange| 0.593 | 0.7852 | 0.484
              banana| 0.18 | 0.96 | 0.05
              apple | 0.423 | 0.886 | 0.156


              Or you can just do direct numerical comparison between users so that you calculate something like statistical entropy between the two across all items per attribute, average for all attributes, and set a range so that if in a certain range they are considered similar or different.



              Hope this helps!







              share|improve this answer








              New contributor




              Lothilius is a new contributor to this site. Take care in asking for clarification, commenting, and answering.
              Check out our Code of Conduct.









              share|improve this answer



              share|improve this answer






              New contributor




              Lothilius is a new contributor to this site. Take care in asking for clarification, commenting, and answering.
              Check out our Code of Conduct.









              answered yesterday









              LothiliusLothilius

              11




              11




              New contributor




              Lothilius is a new contributor to this site. Take care in asking for clarification, commenting, and answering.
              Check out our Code of Conduct.





              New contributor





              Lothilius is a new contributor to this site. Take care in asking for clarification, commenting, and answering.
              Check out our Code of Conduct.






              Lothilius is a new contributor to this site. Take care in asking for clarification, commenting, and answering.
              Check out our Code of Conduct.












              • $begingroup$
                the idea of using unsupervised learning is great but that would only be useful if I had a dataset of all the items from which users could add them. The problem I have is that these items and their attributes will be given to me by an API, so there is no chance that I can get the dataset of all items and their attributes. Also, I couldn't understand the part you said after model building to introduce user and item labels for similarities
                $endgroup$
                – Dhaval Thakkar
                yesterday




















              • $begingroup$
                the idea of using unsupervised learning is great but that would only be useful if I had a dataset of all the items from which users could add them. The problem I have is that these items and their attributes will be given to me by an API, so there is no chance that I can get the dataset of all items and their attributes. Also, I couldn't understand the part you said after model building to introduce user and item labels for similarities
                $endgroup$
                – Dhaval Thakkar
                yesterday


















              $begingroup$
              the idea of using unsupervised learning is great but that would only be useful if I had a dataset of all the items from which users could add them. The problem I have is that these items and their attributes will be given to me by an API, so there is no chance that I can get the dataset of all items and their attributes. Also, I couldn't understand the part you said after model building to introduce user and item labels for similarities
              $endgroup$
              – Dhaval Thakkar
              yesterday






              $begingroup$
              the idea of using unsupervised learning is great but that would only be useful if I had a dataset of all the items from which users could add them. The problem I have is that these items and their attributes will be given to me by an API, so there is no chance that I can get the dataset of all items and their attributes. Also, I couldn't understand the part you said after model building to introduce user and item labels for similarities
              $endgroup$
              – Dhaval Thakkar
              yesterday













              0












              $begingroup$

              You can perform clustering of your customers based on a distance function.
              Definition might look like this:




              1. First, calculate euclidean distances between the first item of the first customer's basket and all of the items in the second customer's basket.

              2. Then find out, what is the closest item from second customer's basket (minimum euclidean distance).

              3. Perform the same operation for each item in first customer's basket.

              4. Calculate mean of the minimum distances.

              5. Do the same for the second customer.

              6. Take maximum of means from the first and the second customer.






              share|improve this answer










              New contributor




              Michał Kardach is a new contributor to this site. Take care in asking for clarification, commenting, and answering.
              Check out our Code of Conduct.






              $endgroup$


















                0












                $begingroup$

                You can perform clustering of your customers based on a distance function.
                Definition might look like this:




                1. First, calculate euclidean distances between the first item of the first customer's basket and all of the items in the second customer's basket.

                2. Then find out, what is the closest item from second customer's basket (minimum euclidean distance).

                3. Perform the same operation for each item in first customer's basket.

                4. Calculate mean of the minimum distances.

                5. Do the same for the second customer.

                6. Take maximum of means from the first and the second customer.






                share|improve this answer










                New contributor




                Michał Kardach is a new contributor to this site. Take care in asking for clarification, commenting, and answering.
                Check out our Code of Conduct.






                $endgroup$
















                  0












                  0








                  0





                  $begingroup$

                  You can perform clustering of your customers based on a distance function.
                  Definition might look like this:




                  1. First, calculate euclidean distances between the first item of the first customer's basket and all of the items in the second customer's basket.

                  2. Then find out, what is the closest item from second customer's basket (minimum euclidean distance).

                  3. Perform the same operation for each item in first customer's basket.

                  4. Calculate mean of the minimum distances.

                  5. Do the same for the second customer.

                  6. Take maximum of means from the first and the second customer.






                  share|improve this answer










                  New contributor




                  Michał Kardach is a new contributor to this site. Take care in asking for clarification, commenting, and answering.
                  Check out our Code of Conduct.






                  $endgroup$



                  You can perform clustering of your customers based on a distance function.
                  Definition might look like this:




                  1. First, calculate euclidean distances between the first item of the first customer's basket and all of the items in the second customer's basket.

                  2. Then find out, what is the closest item from second customer's basket (minimum euclidean distance).

                  3. Perform the same operation for each item in first customer's basket.

                  4. Calculate mean of the minimum distances.

                  5. Do the same for the second customer.

                  6. Take maximum of means from the first and the second customer.







                  share|improve this answer










                  New contributor




                  Michał Kardach is a new contributor to this site. Take care in asking for clarification, commenting, and answering.
                  Check out our Code of Conduct.









                  share|improve this answer



                  share|improve this answer








                  edited 2 hours ago









                  naive

                  2366




                  2366






                  New contributor




                  Michał Kardach is a new contributor to this site. Take care in asking for clarification, commenting, and answering.
                  Check out our Code of Conduct.









                  answered yesterday









                  Michał KardachMichał Kardach

                  14




                  14




                  New contributor




                  Michał Kardach is a new contributor to this site. Take care in asking for clarification, commenting, and answering.
                  Check out our Code of Conduct.





                  New contributor





                  Michał Kardach is a new contributor to this site. Take care in asking for clarification, commenting, and answering.
                  Check out our Code of Conduct.






                  Michał Kardach is a new contributor to this site. Take care in asking for clarification, commenting, and answering.
                  Check out our Code of Conduct.























                      0












                      $begingroup$

                      Is there a reason why you are not using a content-based recommender system? You can use a recommender to "group" users together and once they are grouped, you can introduce members to each other. I guess I don't understand why you are trying to re-invent the wheel on this one - a recommender can get you to where you want to be.






                      share|improve this answer









                      $endgroup$


















                        0












                        $begingroup$

                        Is there a reason why you are not using a content-based recommender system? You can use a recommender to "group" users together and once they are grouped, you can introduce members to each other. I guess I don't understand why you are trying to re-invent the wheel on this one - a recommender can get you to where you want to be.






                        share|improve this answer









                        $endgroup$
















                          0












                          0








                          0





                          $begingroup$

                          Is there a reason why you are not using a content-based recommender system? You can use a recommender to "group" users together and once they are grouped, you can introduce members to each other. I guess I don't understand why you are trying to re-invent the wheel on this one - a recommender can get you to where you want to be.






                          share|improve this answer









                          $endgroup$



                          Is there a reason why you are not using a content-based recommender system? You can use a recommender to "group" users together and once they are grouped, you can introduce members to each other. I guess I don't understand why you are trying to re-invent the wheel on this one - a recommender can get you to where you want to be.







                          share|improve this answer












                          share|improve this answer



                          share|improve this answer










                          answered 1 hour ago









                          I_Play_With_DataI_Play_With_Data

                          1,009422




                          1,009422






















                              Dhaval Thakkar is a new contributor. Be nice, and check out our Code of Conduct.










                              draft saved

                              draft discarded


















                              Dhaval Thakkar is a new contributor. Be nice, and check out our Code of Conduct.













                              Dhaval Thakkar is a new contributor. Be nice, and check out our Code of Conduct.












                              Dhaval Thakkar is a new contributor. Be nice, and check out our Code of Conduct.
















                              Thanks for contributing an answer to Data Science Stack Exchange!


                              • Please be sure to answer the question. Provide details and share your research!

                              But avoid



                              • Asking for help, clarification, or responding to other answers.

                              • Making statements based on opinion; back them up with references or personal experience.


                              Use MathJax to format equations. MathJax reference.


                              To learn more, see our tips on writing great answers.




                              draft saved


                              draft discarded














                              StackExchange.ready(
                              function () {
                              StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fdatascience.stackexchange.com%2fquestions%2f46147%2fhow-to-match-a-user-with-another-user-based-on-their-taste%23new-answer', 'question_page');
                              }
                              );

                              Post as a guest















                              Required, but never shown





















































                              Required, but never shown














                              Required, but never shown












                              Required, but never shown







                              Required, but never shown

































                              Required, but never shown














                              Required, but never shown












                              Required, but never shown







                              Required, but never shown







                              Popular posts from this blog

                              How to label and detect the document text images

                              Vallis Paradisi

                              Tabula Rosettana