How to use MLP mistakes in a certain class to improve its precision in that particular class?
























Suppose we are developing an app that predicts a dog's breed from its picture. We trained a classifier (in my case an MLP) on some dataset and shipped the app to users. Now suppose a user takes a picture of a friend's dog and the app tells her there is a 90% chance that this dog is an X. The user knows that this is not true, but she doesn't know the dog's actual breed (if she knew, why would she use our app in the first place?). So we get feedback that tells us "this is a picture of a dog which is not an X". This sample could belong to some other class, to a new class, or not be a dog at all.

I'm looking for a way to use this feedback to improve the precision of my MLP in class X without touching other classes.

















      mlp






      asked 2 days ago









      Mehraban























          1 Answer












          $begingroup$

          This can be accomplished by a modification to multi-class cross-entropy.



          We are faced with two types of supervision. The first type is "data point $i$ belongs to class $k$", denoted by $y_{ik}=1$, and the second type is "data point $i$ does not belong to class $k$", denoted by $\bar{y}_{ik}=1$. For example, with 3 classes, $y_i=(1, 0, 0)$ denotes that point $i$ belongs to class $1$, and $\bar{y}_{i}=(0, 0, 1)$ denotes that point $i$ does not belong to class $3$. Let $y'_{ik} \in [0, 1]$ denote the model prediction for point $i$ and class $k$. The original cross-entropy for $K$ classes is:



          $$H_y(y')=-\sum_{i}\sum_{k=1}^{K}y_{ik}\log(y'_{ik})$$
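As a quick numerical check of this formula, here is a minimal NumPy sketch. The function name `cross_entropy` and the small `eps` guard against `log(0)` are illustrative choices, not part of the original answer.

```python
import numpy as np

def cross_entropy(y, y_pred, eps=1e-12):
    """Multi-class cross-entropy summed over samples i and classes k.

    y      : (N, K) one-hot array, y[i, k] = 1 if sample i belongs to class k
    y_pred : (N, K) predicted probabilities y'
    eps    : small constant (an assumption here) to avoid log(0)
    """
    y = np.asarray(y, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return -np.sum(y * np.log(y_pred + eps))

# One sample of class 1 (index 0) out of K = 3 classes.
y = [[1, 0, 0]]
confident = [[0.9, 0.05, 0.05]]
unsure = [[0.4, 0.3, 0.3]]
print(cross_entropy(y, confident))  # small loss: -log(0.9)
print(cross_entropy(y, unsure))     # larger loss: -log(0.4)
```

Only the term of the true class contributes, so the loss shrinks as the predicted probability of that class approaches 1.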



          This objective assigns the loss $-\log(y'_{ik})$ whenever $y_{ik} = 1$, which encourages the model to output $y'_{ik} \rightarrow 1$, so that $-\log(y'_{ik})\rightarrow 0$.



          For the second type of supervision, $\bar{y}_{ik}=1$, we instead want to encourage the model to output $y'_{ik} \rightarrow 0$. The loss $-\log(1- y'_{ik})$ achieves this, since $-\log(1- y'_{ik})\rightarrow 0$ as $y'_{ik} \rightarrow 0$.



          Accordingly, the second type of supervision can be combined with the first as follows:



          $$H_{(y,\bar{y})}(y')=-\sum_{i}\sum_{k=1}^{K}\left[y_{ik}\log(y'_{ik})+\bar{y}_{ik}\log(1-y'_{ik})\right]$$
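A minimal NumPy sketch of this combined objective could look as follows. The function name `combined_loss`, the `eps` guard, and the convention that a feedback-only sample has an all-zero row in $y$ are assumptions made for illustration.

```python
import numpy as np

def combined_loss(y, y_bar, y_pred, eps=1e-12):
    """Cross-entropy with positive labels y and negative labels y_bar.

    y      : (N, K), y[i, k] = 1 means "sample i belongs to class k"
    y_bar  : (N, K), y_bar[i, k] = 1 means "sample i does not belong to class k"
    y_pred : (N, K) predicted probabilities y'
    eps    : small constant (an assumption here) to avoid log(0)
    """
    y = np.asarray(y, dtype=float)
    y_bar = np.asarray(y_bar, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    positive = y * np.log(y_pred + eps)            # pushes y'_ik toward 1
    negative = y_bar * np.log(1.0 - y_pred + eps)  # pushes y'_ik toward 0
    return -np.sum(positive + negative)

# Sample 0: ordinary training example of class 1.
# Sample 1: user feedback "not class 3"; its true class is unknown,
#           so its row in y is all zeros (an assumed convention).
y      = [[1, 0, 0], [0, 0, 0]]
y_bar  = [[0, 0, 0], [0, 0, 1]]
y_pred = [[0.8, 0.1, 0.1], [0.05, 0.05, 0.9]]

print(combined_loss(y, y_bar, y_pred))  # dominated by -log(1 - 0.9)
```

In the question's setting, each "this is not an X" report would become one row like sample 1, with the only nonzero entry of $\bar{y}_i$ at class X.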



          Note that supervision of the form "data point $i$ does not belong to classes $1$ and $2$" is also supported. For example, $\bar{y}_{i}=(1, 1, 0,...)$ activates both $-\log(1 - y'_{i1})$ and $-\log(1 - y'_{i2})$, encouraging the model to assign lower probabilities to classes $1$ and $2$, i.e. $y'_{i1} \rightarrow 0$ and $y'_{i2} \rightarrow 0$.
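To see the effect of several negative labels at once, here is a small sketch that runs plain gradient descent on the negative term only. It assumes a softmax output layer and, for simplicity, updates the logits of a single sample directly instead of the weights of an MLP; the logits, step size, and function names are hypothetical.

```python
import numpy as np

def softmax(z):
    e = np.exp(z - np.max(z))
    return e / e.sum()

def negative_loss(z, y_bar):
    """Sum of -log(1 - y'_k) over the classes flagged in y_bar."""
    p = softmax(z)
    return -np.sum(y_bar * np.log(1.0 - p))

def negative_grad(z, y_bar):
    """Gradient of negative_loss with respect to the logits z."""
    p = softmax(z)
    w = y_bar * p / (1.0 - p)   # weight p_k / (1 - p_k) for each flagged class
    return w - p * np.sum(w)    # d/dz_j = w_j - p_j * sum_k w_k

z = np.array([2.0, 1.0, 0.0, 0.0])      # hypothetical logits for one sample
y_bar = np.array([1.0, 1.0, 0.0, 0.0])  # "not class 1 and not class 2"

p_before = softmax(z)
loss_before = negative_loss(z, y_bar)
for _ in range(500):
    z = z - 0.1 * negative_grad(z, y_bar)
p_after = softmax(z)

print(p_before.round(3), "->", p_after.round(3))
```

Because softmax outputs sum to one, the probability removed from classes 1 and 2 is redistributed over the remaining classes, so under this assumption the other outputs do not stay exactly unchanged.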












          $endgroup$













          • What is $y'$? Did you mix $y'$ and $\bar{y}$?
            – Mehraban, yesterday












          • It denotes the model prediction. No, they are not mixed.
            – Esmailian, yesterday











          answered yesterday (edited yesterday)
          Esmailian
