Machine Learning Dataset: Easy enough for fully connected, but not easy enough for logistic regression
I was wondering if someone could direct me to a dataset for a classification task with the following conditions:
- Multinomial logistic regression alone does not learn a good classifier
- A series of fully connected layers is able to learn a good classifier
- The task is not MNIST
Thank you
Tags: machine-learning, dataset, logistic-regression
– asked yesterday by msm
There are lots of datasets you could try, but I would simply suggest the XOR problem, which logistic regression cannot perform well on but fully connected layers handle easily.
– pythinker (yesterday)
Multinomial logistic regression can learn almost anything; the catch is that you have to find the right polynomial features yourself, which is not easy in most cases.
– Vaalizaadeh (yesterday)
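To make both comments concrete, here is a minimal sketch of the XOR experiment. The sample size, hidden-layer width, and degree-2 feature expansion are illustrative assumptions, not part of the original discussion:

```python
# Minimal XOR sketch: plain logistic regression fails, a small fully
# connected network succeeds, and logistic regression with hand-chosen
# polynomial features succeeds too. All sizes and hyperparameters are
# illustrative assumptions.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

rng = np.random.default_rng(0)
X = rng.uniform(-1, 1, size=(2000, 2))
y = ((X[:, 0] > 0) ^ (X[:, 1] > 0)).astype(int)  # label = XOR of the sign bits

log_reg = LogisticRegression().fit(X, y)
mlp = MLPClassifier(hidden_layer_sizes=(16,), max_iter=2000,
                    random_state=0).fit(X, y)
poly_lr = make_pipeline(PolynomialFeatures(degree=2),
                        LogisticRegression()).fit(X, y)

print("logistic regression:", log_reg.score(X, y))     # ~0.5, i.e. chance level
print("small MLP:", mlp.score(X, y))                   # ~1.0
print("LR + degree-2 features:", poly_lr.score(X, y))  # ~1.0
```

The last line illustrates Vaalizaadeh's point: once the x1*x2 interaction term is supplied by hand, the problem becomes linearly separable in the expanded feature space.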
1 Answer
Here's a one-dimensional problem that should be impossible for plain logistic regression:
- Generate x uniformly on [0, 10].
- Let y = sin(2 * pi * x).
If you want a classification problem, define y_disc as:
- 0 if y > 1/sqrt(2),
- 1 if -1/sqrt(2) <= y <= 1/sqrt(2),
- 2 if y < -1/sqrt(2).
The class boundaries are highly non-linear in x, so logistic regression on the raw feature should do poorly. Further, if you decide to feed higher powers of x into your logistic regression classifier, I think you'll need a lot of powers to accurately represent the Taylor series of sin(2 * pi * x) near x = 10.
In fact, you could try adding a small noise term to y before discretizing; I suspect your logistic regressor will overfit long before it's able to accurately approximate the behaviour near x = 10.
– answered yesterday by Harry Braviner
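To make the construction concrete, here is a minimal sketch of this dataset, comparing multinomial logistic regression and a small fully connected network on the raw feature. The sample size, network shape, and training settings are illustrative assumptions; exact scores will vary:

```python
# Sketch of the sin-based three-class problem described in the answer.
# Sample size, network shape, and training settings are illustrative
# assumptions.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.neural_network import MLPClassifier

rng = np.random.default_rng(0)
x = rng.uniform(0, 10, size=5000)
y = np.sin(2 * np.pi * x)

t = 1 / np.sqrt(2)  # discretization thresholds at +/- 1/sqrt(2)
y_disc = np.where(y > t, 0, np.where(y < -t, 2, 1))

X = x.reshape(-1, 1)  # the single raw feature
log_reg = LogisticRegression(max_iter=1000).fit(X, y_disc)
mlp = MLPClassifier(hidden_layer_sizes=(64, 64), max_iter=5000,
                    random_state=0).fit(X, y_disc)

# Logistic regression can do no better than cutting the line [0, 10] into
# three intervals, so it hovers near the majority-class rate (~0.5 here);
# the MLP can carve out all ~40 threshold crossings along the axis.
print("logistic regression:", log_reg.score(X, y_disc))
print("MLP:", mlp.score(X, y_disc))
```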
Your Answer
StackExchange.ready(function() {
var channelOptions = {
tags: "".split(" "),
id: "557"
};
initTagRenderer("".split(" "), "".split(" "), channelOptions);
StackExchange.using("externalEditor", function() {
// Have to fire editor after snippets, if snippets enabled
if (StackExchange.settings.snippets.snippetsEnabled) {
StackExchange.using("snippets", function() {
createEditor();
});
}
else {
createEditor();
}
});
function createEditor() {
StackExchange.prepareEditor({
heartbeatType: 'answer',
autoActivateHeartbeat: false,
convertImagesToLinks: false,
noModals: true,
showLowRepImageUploadWarning: true,
reputationToPostImages: null,
bindNavPrevention: true,
postfix: "",
imageUploader: {
brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
allowUrls: true
},
onDemand: true,
discardSelector: ".discard-answer"
,immediatelyShowMarkdownHelp:true
});
}
});
msm is a new contributor. Be nice, and check out our Code of Conduct.
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
StackExchange.ready(
function () {
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fdatascience.stackexchange.com%2fquestions%2f49145%2fmachine-learning-dataset-easy-enough-for-fully-connected-but-not-easy-enough-f%23new-answer', 'question_page');
}
);
Post as a guest
Required, but never shown
1 Answer
1
active
oldest
votes
1 Answer
1
active
oldest
votes
active
oldest
votes
active
oldest
votes
$begingroup$
Here's a one-dimensional problem that should be impossible for logistic regression:
- Generate x uniformly on [0, 10]
- Let y = sin(2 * pi * x)
If you want a classification problem define y_disc as:
- 0 if y > 1/sqrt(2)
- 1 if 1/sqrt(2) > y > -1/sqrt(2)
- 2 if y < -1/sqrt(2).
This is non-linear, so logistic regression should do poorly. Further, if you decide to put higher powers of x into your logistic regression classifier, I think you'll need a lot of powers to accurately represent the Taylor series near 10.
If fact, you could try adding a small noise term to y before discretizing - I suspect that will result in your logistic regressor overfitting long before it's able to accurately approximate the behavior near x=10.
New contributor
$endgroup$
add a comment |
$begingroup$
Here's a one-dimensional problem that should be impossible for logistic regression:
- Generate x uniformly on [0, 10]
- Let y = sin(2 * pi * x)
If you want a classification problem define y_disc as:
- 0 if y > 1/sqrt(2)
- 1 if 1/sqrt(2) > y > -1/sqrt(2)
- 2 if y < -1/sqrt(2).
This is non-linear, so logistic regression should do poorly. Further, if you decide to put higher powers of x into your logistic regression classifier, I think you'll need a lot of powers to accurately represent the Taylor series near 10.
If fact, you could try adding a small noise term to y before discretizing - I suspect that will result in your logistic regressor overfitting long before it's able to accurately approximate the behavior near x=10.
New contributor
$endgroup$
add a comment |
$begingroup$
Here's a one-dimensional problem that should be impossible for logistic regression:
- Generate x uniformly on [0, 10]
- Let y = sin(2 * pi * x)
If you want a classification problem define y_disc as:
- 0 if y > 1/sqrt(2)
- 1 if 1/sqrt(2) > y > -1/sqrt(2)
- 2 if y < -1/sqrt(2).
This is non-linear, so logistic regression should do poorly. Further, if you decide to put higher powers of x into your logistic regression classifier, I think you'll need a lot of powers to accurately represent the Taylor series near 10.
If fact, you could try adding a small noise term to y before discretizing - I suspect that will result in your logistic regressor overfitting long before it's able to accurately approximate the behavior near x=10.
New contributor
$endgroup$
Here's a one-dimensional problem that should be impossible for logistic regression:
- Generate x uniformly on [0, 10]
- Let y = sin(2 * pi * x)
If you want a classification problem define y_disc as:
- 0 if y > 1/sqrt(2)
- 1 if 1/sqrt(2) > y > -1/sqrt(2)
- 2 if y < -1/sqrt(2).
This is non-linear, so logistic regression should do poorly. Further, if you decide to put higher powers of x into your logistic regression classifier, I think you'll need a lot of powers to accurately represent the Taylor series near 10.
If fact, you could try adding a small noise term to y before discretizing - I suspect that will result in your logistic regressor overfitting long before it's able to accurately approximate the behavior near x=10.
New contributor
New contributor
answered yesterday
Harry BravinerHarry Braviner
1111
1111
New contributor
New contributor
add a comment |
add a comment |
msm is a new contributor. Be nice, and check out our Code of Conduct.
msm is a new contributor. Be nice, and check out our Code of Conduct.
msm is a new contributor. Be nice, and check out our Code of Conduct.
msm is a new contributor. Be nice, and check out our Code of Conduct.
Thanks for contributing an answer to Data Science Stack Exchange!
- Please be sure to answer the question. Provide details and share your research!
But avoid …
- Asking for help, clarification, or responding to other answers.
- Making statements based on opinion; back them up with references or personal experience.
Use MathJax to format equations. MathJax reference.
To learn more, see our tips on writing great answers.
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
StackExchange.ready(
function () {
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fdatascience.stackexchange.com%2fquestions%2f49145%2fmachine-learning-dataset-easy-enough-for-fully-connected-but-not-easy-enough-f%23new-answer', 'question_page');
}
);
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
1
$begingroup$
There are lots of datasets you can try but I would simply suggest XOR problem which logistic regression can not perform well on but fully connected layers can achieve a good performance.
$endgroup$
– pythinker
yesterday
$begingroup$
Multinomial logistic can learn almost everything. The point is that you yourself have to find the correct polynomials which is not easy in most of the cases.
$endgroup$
– Vaalizaadeh
yesterday