policy gradient loss [on hold]












$begingroup$


I am confused about how the loss is calculated in a policy gradient method. My code is below:



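import tensorflow as tf

# Unnormalized action scores from the policy network, one row per state.
logits = policy.predictions(states)
# Per-sample cross-entropy; this op expects labels with the same shape as logits (e.g. one-hot).
negative_likelihoods = tf.nn.softmax_cross_entropy_with_logits(labels=actions, logits=logits)
# Weight each per-sample term by its Q-value, then average over the batch.
weighted_negative_likelihoods = tf.multiply(negative_likelihoods, q_values)
loss = tf.reduce_mean(weighted_negative_likelihoods)
# Gradients of the loss with respect to the policy variables.
gradients = tf.gradients(loss, variables)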


logits is the output of the policy network before the softmax is applied.
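To make the role of each argument concrete, here is a minimal pure-Python sketch of the same computation. It is not the asker's code: it assumes the snippet follows the usual REINFORCE-style estimator, in which the labels are one-hot encodings of the actions the agent sampled and executed. The helper names and the toy batch are hypothetical. Under that assumption, the softmax cross-entropy term reduces to the negative log-probability of the taken action.

```python
import math

def log_softmax(logits):
    """Numerically stable log-softmax over one row of logits."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - log_z for x in logits]

def softmax_cross_entropy(label, logits):
    """Cross-entropy between a label distribution and softmax(logits)."""
    log_probs = log_softmax(logits)
    return -sum(y * lp for y, lp in zip(label, log_probs))

def policy_gradient_loss(logits_batch, taken_actions, q_values):
    """Batch mean of -log pi(a_t | s_t) * Q_t."""
    num_actions = len(logits_batch[0])
    total = 0.0
    for logits, action, q in zip(logits_batch, taken_actions, q_values):
        # One-hot encode the action index (assumed: the action actually executed).
        one_hot = [1.0 if i == action else 0.0 for i in range(num_actions)]
        total += softmax_cross_entropy(one_hot, logits) * q
    return total / len(logits_batch)

# Hypothetical toy batch: two states, three discrete actions.
logits_batch = [[2.0, 0.5, -1.0], [0.0, 0.0, 0.0]]
taken_actions = [0, 2]   # indices of the actions recorded at each step
q_values = [1.5, -0.5]   # return or Q estimate associated with each step
loss = policy_gradient_loss(logits_batch, taken_actions, q_values)
```

With a one-hot label, only the log-probability of the selected action survives the sum, so weighting by the Q-value scales how strongly that action's probability is pushed up or down.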



My question is:

What does actions mean? Is it the action that the agent actually executed at step t, or the action it should have executed at step t?
Thanks.










Kang_Kai is a new contributor to this site.







$endgroup$



put on hold as unclear what you're asking by Sean Owen yesterday


Please clarify your specific problem or add additional details to highlight exactly what you need. As it's currently written, it’s hard to tell exactly what you're asking. See the How to Ask page for help clarifying this question. If this question can be reworded to fit the rules in the help center, please edit the question.























Tags: python tensorflow loss-function policy-gradients






asked yesterday by Kang_Kai
edited yesterday by HFulcher



