How to retain dependency between variables in PyTorch?

I am modeling k-dimensional positions over time t = 0...T. The initial positions Z_0 are created with requires_grad=True, and the results for the remaining T-1 time steps are stored in Z with requires_grad=False.



A simple model is
Z_t = Z_{t-1} + e,
where e is some constant noise.
The model is optimized in PyTorch using gradient descent, which moves the initial positions accordingly.
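
For reference, the recurrence can be sketched in PyTorch as below. This is only a minimal sketch under assumptions not stated in the question: the sizes `k` and `T`, the all-zero `target`, the squared-error loss and the helper name `rollout` are all hypothetical. Each step is appended to a Python list and stacked at the end, so every Z_t stays connected to Z_0 in the autograd graph.

```python
import torch

torch.manual_seed(0)

k, T = 2, 5                                # hypothetical dimension and horizon
Z0 = torch.randn(k, requires_grad=True)    # initial positions, the only leaf being optimized
e = torch.full((k,), 0.1)                  # constant noise term (no gradient needed)


def rollout(Z0, e, T):
    """Return a (T+1, k) tensor with Z[t] = Z[t-1] + e, keeping the autograd graph."""
    steps = [Z0]
    for _ in range(T):
        # Build each step from the previous tensor itself (not a detached copy),
        # so the chain back to Z0 is preserved.
        steps.append(steps[-1] + e)
    return torch.stack(steps)


Z = rollout(Z0, e, T)
target = torch.zeros(T + 1, k)             # hypothetical targets, for illustration only
loss = ((Z - target) ** 2).sum()
loss.backward()                            # fills Z0.grad with contributions from all steps
```

Because the steps are never detached, `Z0.grad` receives a contribution from every time step, just as it would for the closed form Z_0 + t * e.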



The problem is that when Z is used to compute subsequent time steps for t > 1, the relation between Z_t and Z_0 is lost, so the model converges significantly more slowly than when simply modeling
Z_t = Z_0 + t * e, where the dependency between the initial positions and Z_t is retained.
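
One possible way for the relation to get lost is sketched below, assuming the intermediate results are copied into Z with `.detach()` (or computed under `torch.no_grad()`). Whether this matches the actual code is an assumption; the helper `grad_wrt_Z0`, the sizes and the sum-of-squares loss are hypothetical and only serve to compare the two cases.

```python
import torch

k, T = 2, 5                      # hypothetical dimension and horizon
e = torch.full((k,), 0.1)        # constant noise term


def grad_wrt_Z0(detach_steps):
    """Roll out Z[t] = Z[t-1] + e and return d(loss)/d(Z0).

    If detach_steps is True, each step is computed from a detached copy of the
    previous one, which cuts the autograd graph between time steps.
    """
    Z0 = torch.ones(k, requires_grad=True)
    steps = [Z0]
    for _ in range(T):
        prev = steps[-1].detach() if detach_steps else steps[-1]
        steps.append(prev + e)
    loss = (torch.stack(steps) ** 2).sum()
    loss.backward()
    return Z0.grad


g_detached = grad_wrt_Z0(True)   # only the t = 0 term reaches Z0
g_retained = grad_wrt_Z0(False)  # every time step contributes
print(g_detached, g_retained)
```

In this toy setup the detached rollout gives a gradient of 2 per coordinate (only the t = 0 term), while the retained rollout gives about 15, since all six time steps contribute. A much smaller gradient of this kind would be consistent with the slower convergence described above.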



Note: This model is for illustrative purposes only. The models in question are too complex to be defined in terms of Z_0 alone and require the intermediate results in Z.



Accumulating gradients or retaining the gradient graph does not help.










machine-learning python gradient-descent pytorch






asked yesterday by Helge
