Machine Learning Dataset: Easy enough for fully connected, but not easy enough for logistic regression












I was wondering if someone could direct me to a dataset for a classification task with the following conditions:

  • Multinomial logistic regression alone does not learn a good classifier

  • A series of fully connected layers is able to learn a good classifier

  • The task is not MNIST

Thank you








  • There are many datasets you could try, but I would simply suggest the XOR problem: logistic regression cannot perform well on it, whereas fully connected layers can achieve good performance.
    – pythinker, yesterday

  • Multinomial logistic regression can learn almost anything. The catch is that you have to find the right polynomial features yourself, which is not easy in most cases.
    – Vaalizaadeh, yesterday
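To make the two comments above concrete, here is a minimal sketch using scikit-learn. It is not from the original thread. It assumes a continuous variant of XOR (points drawn uniformly from [-1, 1]^2, labelled 1 when the two coordinates have opposite signs); the sample size, the network width, the regularisation strength and the helper name `add_product` are all illustrative choices.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

rng = np.random.default_rng(0)

# Continuous XOR (an assumed variant of the classic 4-point problem):
# label 1 when the two coordinates have opposite signs.
X = rng.uniform(-1.0, 1.0, size=(2000, 2))
y = (X[:, 0] * X[:, 1] < 0).astype(int)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# 1) Plain logistic regression on the raw coordinates: one linear boundary.
lr_raw = LogisticRegression().fit(X_train, y_train)
acc_lr_raw = lr_raw.score(X_test, y_test)


def add_product(features):
    """Append the hand-crafted feature x1 * x2 (hypothetical helper)."""
    return np.column_stack([features, features[:, 0] * features[:, 1]])


# 2) Logistic regression again, but with the product feature added by hand.
lr_prod = LogisticRegression(C=100.0, max_iter=1000)
lr_prod.fit(add_product(X_train), y_train)
acc_lr_prod = lr_prod.score(add_product(X_test), y_test)

# 3) Small fully connected network on the raw coordinates.
mlp = MLPClassifier(hidden_layer_sizes=(16, 16), max_iter=2000, random_state=0)
mlp.fit(X_train, y_train)
acc_mlp = mlp.score(X_test, y_test)

print(f"logistic regression, raw features:     {acc_lr_raw:.3f}")
print(f"logistic regression, + x1*x2 feature:  {acc_lr_prod:.3f}")
print(f"fully connected network, raw features: {acc_mlp:.3f}")
```

On this data one would expect the raw logistic regression to stay near chance, while the product-feature model and the network both do far better. The exact numbers depend on the seed and the settings chosen above.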
















machine-learning dataset logistic-regression






asked yesterday by msm




1 Answer
Here's a one-dimensional problem that plain logistic regression (with x as the only feature) should be unable to fit:

  • Generate x uniformly on [0, 10]

  • Let y = sin(2 * pi * x)

If you want a classification problem, define y_disc as:

  • 0 if y > 1/sqrt(2)

  • 1 if -1/sqrt(2) <= y <= 1/sqrt(2)

  • 2 if y < -1/sqrt(2)

This is non-linear, so logistic regression should do poorly: with x as the only input, each class's predicted region is a single interval, while the true label changes about 40 times over [0, 10]. Further, if you decide to put higher powers of x into your logistic regression classifier, I think you'll need a lot of powers to accurately represent the Taylor series near 10.

In fact, you could try adding a small noise term to y before discretizing. I suspect that will result in your logistic regressor overfitting long before it's able to accurately approximate the behavior near x = 10.
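The recipe above can be sketched in a few lines of Python with NumPy and scikit-learn. This is an illustration under stated assumptions, not code from the answer: the function name `make_sine_dataset`, the sample size, the polynomial degree of 9 and the network size are all arbitrary choices. No accuracy is claimed for the fully connected network, since how well it fits this high-frequency target depends on its width, training budget and input scaling.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures, StandardScaler


def make_sine_dataset(n_samples=4000, noise_std=0.0, seed=0):
    """Sample x ~ U[0, 10], y = sin(2*pi*x) plus optional noise, then discretize.

    Hypothetical helper; the name and defaults are not from the original answer.
    """
    rng = np.random.default_rng(seed)
    x = rng.uniform(0.0, 10.0, size=n_samples)
    y = np.sin(2.0 * np.pi * x) + rng.normal(0.0, noise_std, size=n_samples)
    t = 1.0 / np.sqrt(2.0)
    # Class 0 above the upper threshold, class 2 below the lower one, else class 1.
    y_disc = np.where(y > t, 0, np.where(y < -t, 2, 1))
    return x.reshape(-1, 1), y_disc


X, y = make_sine_dataset()
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Three candidate models; all hyperparameters are illustrative assumptions.
models = {
    "logistic regression (x only)": make_pipeline(
        StandardScaler(), LogisticRegression(max_iter=1000)
    ),
    "logistic regression (powers of x up to 9)": make_pipeline(
        PolynomialFeatures(degree=9, include_bias=False),
        StandardScaler(),
        LogisticRegression(max_iter=1000),
    ),
    "fully connected network": make_pipeline(
        StandardScaler(),
        MLPClassifier(hidden_layer_sizes=(64, 64), max_iter=500, random_state=0),
    ),
}

scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    scores[name] = model.score(X_test, y_test)
    print(f"{name}: test accuracy {scores[name]:.3f}")
```

To try the noisy variant, call `make_sine_dataset(noise_std=0.1)`; the noise is added to y before thresholding. Since class 1 covers half of each period, a model that always predicts class 1 already scores about 0.5, which is a useful baseline when reading the printed accuracies.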














answered yesterday by Harry Braviner



