Predicting Intent to do X with a confidence score or intent percentage score?
I have a data set like:

did_purchase  action_1_30d  action_2_20d  action_2_10d  ....
False         10            20            100
True          ....etc

Where did_purchase shows whether the customer purchased or not, and the other columns give the number of actions taken before the purchase (or non-purchase) event. So, for the first row, the customer did action_1 10 times within 30 days of the purchase event, but didn't purchase in the end.

I have been using sklearn's LogisticRegression to predict did_purchase (false/true), and can get about 89% accuracy, which is nice.

However, I'd like a percentage intent score instead, so it could say, for example, that user-321 has a 46% chance of purchasing in the next 10 days.

What would be a good algorithm or approach for this?

logistic-regression
You mention 89% accuracy. What is the distribution of class labels? Is there a class imbalance? If so, accuracy may not be the right metric here.
– Wes, 12 hours ago

Sorry, I meant F1 is 0.89. The class labels were imbalanced (1% Yes, 99% No), but I SMOTE'd them.
– LittleBobbyTables, 12 hours ago
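To illustrate the point raised in the comments, here is a minimal sketch of why accuracy can mislead on a 1% / 99% class split. The labels are made up for illustration (they are not the asker's data), and the "model" is a hypothetical baseline that always predicts "no purchase".

```python
# Hypothetical labels mirroring a 1% Yes / 99% No split (not real data).
y_true = [1] * 10 + [0] * 990

# A useless baseline that always predicts "no purchase".
y_pred = [0] * len(y_true)


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def f1(y_true, y_pred):
    """F1 score for the positive class (1); defined as 0.0 with no true positives."""
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


acc = accuracy(y_true, y_pred)
f1_value = f1(y_true, y_pred)
print(f"accuracy = {acc:.2f}, F1 = {f1_value:.2f}")
# prints: accuracy = 0.99, F1 = 0.00
```

The baseline never identifies a single purchaser, yet it scores 99% accuracy, which is why a metric such as F1 on the positive class is more informative here.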
asked 12 hours ago by LittleBobbyTables
1 Answer
You could use the probabilities output by LogisticRegression's predict_proba method.
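As a rough sketch of what that could look like, the example below fits scikit-learn's LogisticRegression and reads the probability of the positive class from predict_proba. Everything about the data is an assumption for illustration: the three feature columns are synthetic stand-ins for the action counts, and the rule that generates did_purchase is invented.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

# Synthetic stand-in for the action-count columns (hypothetical data).
n = 500
X = rng.poisson(lam=[10, 20, 50], size=(n, 3)).astype(float)

# Invented rule: more of the first action makes a purchase more likely.
logits = 0.3 * (X[:, 0] - 10) - 0.5
y = rng.random(n) < 1 / (1 + np.exp(-logits))  # boolean did_purchase

model = LogisticRegression(max_iter=1000)
model.fit(X, y)

# predict_proba returns one column per class, ordered as in model.classes_.
proba = model.predict_proba(X[:5])
true_col = list(model.classes_).index(True)

# Convert the probability of did_purchase == True into a percentage score.
intent_pct = 100 * proba[:, true_col]
for i, pct in enumerate(intent_pct):
    print(f"user-{i}: {pct:.0f}% estimated chance of purchasing")
```

Looking up the column through model.classes_ avoids assuming which column holds the True class. Note also that if the training data was rebalanced (for example with SMOTE), the scores reflect the rebalanced class mix, so they may need calibration before being read as real-world percentages.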
answered 12 hours ago by Wes