YOLO v3 complete architecture

I am attempting to implement YOLOv3 in TensorFlow-Keras from scratch, with the aim of training my own model on a custom dataset, that is, without using pretrained weights. I have gone through all three papers (YOLOv1, YOLOv2 (YOLO9000) and YOLOv3) and found that although Darknet-53 is used as the feature extractor for YOLOv3, I cannot find the complete architecture that extends after it: the "detection" layers talked about here. After a lot of reading of blog posts on Medium, KDnuggets and similar sites, I ended up with a few significant questions:

  • Have I missed the complete architecture of the detection layers (the ones that extend after Darknet-53, which is used for feature extraction) somewhere in the YOLOv3 paper?

  • The author seems to use different image sizes at different stages of training. Does the network automatically do this upscaling/downscaling of images?

  • For preprocessing the images, is it really enough to just resize them and then normalize them (dividing by 255)?

Please be kind enough to point me in the right direction. I appreciate the help!
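To make the preprocessing question concrete, here is a minimal sketch of the resize-then-divide-by-255 step. It assumes NumPy only, a square 416x416 target (an assumed size, not one fixed in the question), nearest-neighbour sampling, and a hypothetical helper name `preprocess`. It is an illustration of the step being asked about, not code from the paper or from any YOLO implementation.

```python
import numpy as np


def preprocess(image, target_size=416):
    """Resize an HxWxC uint8 image to target_size x target_size and scale to [0, 1].

    Hypothetical helper: plain nearest-neighbour resize that ignores the
    original aspect ratio. target_size=416 is an assumption.
    """
    h, w = image.shape[:2]
    # Map every output row/column back to a source row/column index.
    rows = np.arange(target_size) * h // target_size
    cols = np.arange(target_size) * w // target_size
    resized = image[rows[:, None], cols]
    # Normalise pixel values from [0, 255] to [0, 1].
    return resized.astype(np.float32) / 255.0


# Dummy 480x640 RGB image standing in for a real dataset sample.
dummy = np.random.randint(0, 256, size=(480, 640, 3), dtype=np.uint8)
batch = preprocess(dummy)[np.newaxis, ...]  # add a batch dimension
print(batch.shape, batch.dtype)
```

The bounding-box labels would need to be rescaled by the same factors as the image; that part is not shown here.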

















      keras tensorflow object-detection object-recognition yolo





      asked 6 mins ago by hridayns
