How to match a user with another user based on their taste?

Information available

Consider that there are N users on a platform. Every user adds items that they like on their profile. These items have static attributes that describe the product.

User A:

    Row   | Attribute a | Attribute b | Attribute c
    Item 1|    0.593    |    0.7852   |   0.484
    Item 2|    0.18     |    0.96     |   0.05
    Item 3|    0.423    |    0.886    |   0.156

User B:

    Row   | Attribute a | Attribute b | Attribute c
    Item 7|    0.228    |    0.148    |   0.658
    Item 8|    0.785    |    0.33     |   0.887
    Item 9|    0.569    |    0.994    |   0.374

User A has a list of items that they like, and the same goes for User B through User N. Different users' profiles may or may not contain the same items, but the items in a profile describe that user's taste.

Goal

I want to match a user with another user if they have similar taste in picking items. I don't understand how to achieve this. Any help is appreciated!

asked yesterday

Dhaval Thakkar

machine-learning python deep-learning recommender-system

3 Answers

You could try unsupervised clustering. You may want to leave out the user and item labels to start. Depending on how much data you have and how many "categories" you expect to end up with, you can use K-means or mean shift clustering. The idea is to let the algorithm work out the similarities so that it groups similar items together; those groups give you the "categories". You can then reuse the fitted model for any future items.
After you have done this, you can introduce the user and item labels to build the similarity at the user level.
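To make the two stages concrete, here is a minimal sketch in plain Python. It is only one possible reading of the idea: the tiny `kmeans` helper, the choice of `k=2`, and the deterministic "first k points" initialisation are my own assumptions, and in practice you would more likely use a library implementation. The items are pooled without user labels and clustered; then the user labels come back in, and each user is described by the share of their items that falls into each cluster.

```python
import math

def euclidean(p, q):
    """Euclidean distance between two attribute vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

def nearest(point, centroids):
    """Index of the centroid closest to `point`."""
    return min(range(len(centroids)), key=lambda i: euclidean(point, centroids[i]))

def kmeans(points, k, iterations=20):
    """Very small k-means; starts from the first k points (an assumption, for reproducibility)."""
    centroids = [list(p) for p in points[:k]]
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for p in points:
            clusters[nearest(p, centroids)].append(p)
        for i, members in enumerate(clusters):
            if members:  # an empty cluster keeps its previous centroid
                centroids[i] = [sum(col) / len(members) for col in zip(*members)]
    return centroids

def cluster_histogram(items, centroids):
    """Fraction of a user's items that fall into each cluster."""
    counts = [0] * len(centroids)
    for item in items:
        counts[nearest(item, centroids)] += 1
    return [c / len(items) for c in counts]

# Item attributes taken from the question.
users = {
    "A": [(0.593, 0.7852, 0.484), (0.18, 0.96, 0.05), (0.423, 0.886, 0.156)],
    "B": [(0.228, 0.148, 0.658), (0.785, 0.33, 0.887), (0.569, 0.994, 0.374)],
}

# Stage 1: cluster all items seen so far, ignoring who owns them.
all_items = [item for items in users.values() for item in items]
centroids = kmeans(all_items, k=2)

# Stage 2: bring the user labels back and summarise each user per cluster.
histograms = {name: cluster_histogram(items, centroids) for name, items in users.items()}
print(histograms)
```

Users whose histograms look alike are picking items from the same "categories". Note that this only needs the items that have actually appeared in profiles, not a full catalogue.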

A next step in exploration, depending on the items and attributes, might be to reduce each user's items to the average of each attribute, so that every user is described by one average per attribute across all of their items. You can then cluster those averages to find types of "user".
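As a rough illustration of the averaging step, the sketch below turns each user into a single mean attribute vector and compares users by Euclidean distance. The helper names (`mean_profile`, `closest_user`) are made up for this example, and Euclidean distance is just one reasonable choice of comparison.

```python
import math

def mean_profile(items):
    """Average each attribute over all of a user's items."""
    return [sum(col) / len(items) for col in zip(*items)]

def euclidean(p, q):
    """Euclidean distance between two attribute vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

def closest_user(name, profiles):
    """Name of the other user whose mean profile is nearest to `name`'s."""
    others = [other for other in profiles if other != name]
    return min(others, key=lambda other: euclidean(profiles[name], profiles[other]))

# Item attributes taken from the question.
user_a = [(0.593, 0.7852, 0.484), (0.18, 0.96, 0.05), (0.423, 0.886, 0.156)]
user_b = [(0.228, 0.148, 0.658), (0.785, 0.33, 0.887), (0.569, 0.994, 0.374)]

profile_a = mean_profile(user_a)
profile_b = mean_profile(user_b)
profiles = {"A": profile_a, "B": profile_b}

print(profile_a, profile_b)
print(euclidean(profile_a, profile_b))
print(closest_user("A", profiles))
```

With more than two users, the same `profiles` dictionary could be fed to a clustering algorithm instead of a nearest-user lookup. Keep in mind that averaging hides variety: a user who likes two very different kinds of item ends up with a profile that matches neither.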

Both approaches assume that each item's attributes are comparable to those of the other items, e.g.:

    item  | sweetness   |   acidity   |  bitterness
    orange|    0.593    |    0.7852   |   0.484
    banana|    0.18     |    0.96     |   0.05
    apple |    0.423    |    0.886    |   0.156

Or you can just do a direct numerical comparison between users: calculate something like the statistical entropy between the two users across all items for each attribute, average over all attributes, and set a range so that users whose score falls within it are considered similar and those outside it are considered different.
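"Something like statistical entropy" leaves the exact measure open. One entropy-based option, which is my own pick and not necessarily what was meant, is the Jensen-Shannon divergence between the two users' value distributions for each attribute. The sketch below assumes attribute values lie in [0, 1], bins them into 4 buckets, and uses a cut-off of 0.5; the bin count and the threshold are both arbitrary placeholders you would need to tune.

```python
import math

def histogram(values, bins=4):
    """Normalised histogram of values assumed to lie in [0, 1]."""
    counts = [0] * bins
    for v in values:
        counts[min(int(v * bins), bins - 1)] += 1
    return [c / len(values) for c in counts]

def entropy(p):
    """Shannon entropy in bits."""
    return -sum(x * math.log2(x) for x in p if x > 0)

def js_divergence(p, q):
    """Jensen-Shannon divergence in bits: 0 for identical, 1 for disjoint distributions."""
    m = [(a + b) / 2 for a, b in zip(p, q)]
    return entropy(m) - (entropy(p) + entropy(q)) / 2

def user_divergence(items_a, items_b, bins=4):
    """Per-attribute divergence between two users, averaged over all attributes."""
    per_attribute = [
        js_divergence(histogram(col_a, bins), histogram(col_b, bins))
        for col_a, col_b in zip(zip(*items_a), zip(*items_b))
    ]
    return sum(per_attribute) / len(per_attribute)

# Item attributes taken from the question.
user_a = [(0.593, 0.7852, 0.484), (0.18, 0.96, 0.05), (0.423, 0.886, 0.156)]
user_b = [(0.228, 0.148, 0.658), (0.785, 0.33, 0.887), (0.569, 0.994, 0.374)]

THRESHOLD = 0.5  # arbitrary cut-off, to be tuned on real data
score = user_divergence(user_a, user_b)
print(score, "similar" if score < THRESHOLD else "different")
```

With only three items per user the histograms are very coarse, so treat the score as meaningful only once users have a reasonable number of items.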

Hope this helps!

answered yesterday

Lothilius

The idea of using unsupervised learning is great, but it would only be useful if I had a dataset of all the items that users could add. The problem is that the items and their attributes are given to me by an API, so there is no way for me to get the dataset of all items and their attributes. Also, I couldn't understand the part about introducing user and item labels for similarities after building the model.
– Dhaval Thakkar
yesterday

You can cluster your customers based on a distance function between their baskets. The definition might look like this:

1. Calculate the Euclidean distances between the first item in the first customer's basket and every item in the second customer's basket.
2. Find the closest item in the second customer's basket (the minimum Euclidean distance).
3. Repeat this for each item in the first customer's basket.
4. Calculate the mean of these minimum distances.
5. Do the same for the second customer, measuring from their items to the first customer's basket.
6. Take the maximum of the two means as the distance between the customers.
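The steps above can be sketched in a few lines of plain Python. The function names are made up for this example; the measure itself resembles what is sometimes called the modified Hausdorff distance between two point sets.

```python
import math

def euclidean(p, q):
    """Euclidean distance between two attribute vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

def mean_min_distance(basket, other):
    """For each item in `basket`, take the distance to its closest item in `other`; return the mean."""
    return sum(min(euclidean(item, o) for o in other) for item in basket) / len(basket)

def basket_distance(basket_a, basket_b):
    """Maximum of the two directed mean-minimum distances, so the result is symmetric."""
    return max(mean_min_distance(basket_a, basket_b),
               mean_min_distance(basket_b, basket_a))

# Item attributes taken from the question.
user_a = [(0.593, 0.7852, 0.484), (0.18, 0.96, 0.05), (0.423, 0.886, 0.156)]
user_b = [(0.228, 0.148, 0.658), (0.785, 0.33, 0.887), (0.569, 0.994, 0.374)]

print(basket_distance(user_a, user_b))
```

Because the distance works on the baskets directly, it does not need a full item catalogue and baskets can differ in size. The resulting pairwise distances can be passed to any clustering method that accepts a precomputed distance matrix, or used directly to pick each user's nearest neighbour.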

edited 2 hours ago

naive

answered yesterday

Michał Kardach

Is there a reason why you are not using a content-based recommender system? You can use a recommender to "group" users together and once they are grouped, you can introduce members to each other. I guess I don't understand why you are trying to re-invent the wheel on this one - a recommender can get you to where you want to be.

answered 1 hour ago

I_Play_With_Data
