What is the difference between Inception v2 and Inception v3?












14












$begingroup$


The paper Going deeper with convolutions describes GoogleNet which contains the original inception modules:



enter image description here



The change to inception v2 was that they replaced the 5x5 convolutions by two successive 3x3 convolutions and applied pooling:



enter image description here



What is the difference between Inception v2 and Inception v3?










share|improve this question









$endgroup$












  • $begingroup$
    Is it simply batch normalization? Or does Inception v2 already have batch normalization?
    $endgroup$
    – Martin Thoma
    Nov 24 '16 at 11:16










  • $begingroup$
    github.com/SKKSaikia/CNN-GoogLeNet This repository holds all the versions of GoogLeNet and their difference. Give it a try.
    $endgroup$
    – Amartya Ranjan Saikia
    Feb 13 '18 at 4:21
















14












$begingroup$


The paper Going deeper with convolutions describes GoogleNet which contains the original inception modules:



enter image description here



The change to inception v2 was that they replaced the 5x5 convolutions by two successive 3x3 convolutions and applied pooling:



enter image description here



What is the difference between Inception v2 and Inception v3?










share|improve this question









$endgroup$












  • $begingroup$
    Is it simply batch normalization? Or does Inception v2 already have batch normalization?
    $endgroup$
    – Martin Thoma
    Nov 24 '16 at 11:16










  • $begingroup$
    github.com/SKKSaikia/CNN-GoogLeNet This repository holds all the versions of GoogLeNet and their difference. Give it a try.
    $endgroup$
    – Amartya Ranjan Saikia
    Feb 13 '18 at 4:21














14












14








14


2



$begingroup$


The paper Going deeper with convolutions describes GoogleNet which contains the original inception modules:



enter image description here



The change to inception v2 was that they replaced the 5x5 convolutions by two successive 3x3 convolutions and applied pooling:



enter image description here



What is the difference between Inception v2 and Inception v3?










share|improve this question









$endgroup$




The paper Going deeper with convolutions describes GoogleNet which contains the original inception modules:



enter image description here



The change to inception v2 was that they replaced the 5x5 convolutions by two successive 3x3 convolutions and applied pooling:



enter image description here



What is the difference between Inception v2 and Inception v3?







image-classification convnet computer-vision inception






share|improve this question













share|improve this question











share|improve this question




share|improve this question










asked Nov 24 '16 at 11:10









Martin ThomaMartin Thoma

6,0051353126




6,0051353126












  • $begingroup$
    Is it simply batch normalization? Or does Inception v2 already have batch normalization?
    $endgroup$
    – Martin Thoma
    Nov 24 '16 at 11:16










  • $begingroup$
    github.com/SKKSaikia/CNN-GoogLeNet This repository holds all the versions of GoogLeNet and their difference. Give it a try.
    $endgroup$
    – Amartya Ranjan Saikia
    Feb 13 '18 at 4:21


















  • $begingroup$
    Is it simply batch normalization? Or does Inception v2 already have batch normalization?
    $endgroup$
    – Martin Thoma
    Nov 24 '16 at 11:16










  • $begingroup$
    github.com/SKKSaikia/CNN-GoogLeNet This repository holds all the versions of GoogLeNet and their difference. Give it a try.
    $endgroup$
    – Amartya Ranjan Saikia
    Feb 13 '18 at 4:21
















$begingroup$
Is it simply batch normalization? Or does Inception v2 already have batch normalization?
$endgroup$
– Martin Thoma
Nov 24 '16 at 11:16




$begingroup$
Is it simply batch normalization? Or does Inception v2 already have batch normalization?
$endgroup$
– Martin Thoma
Nov 24 '16 at 11:16












$begingroup$
github.com/SKKSaikia/CNN-GoogLeNet This repository holds all the versions of GoogLeNet and their difference. Give it a try.
$endgroup$
– Amartya Ranjan Saikia
Feb 13 '18 at 4:21




$begingroup$
github.com/SKKSaikia/CNN-GoogLeNet This repository holds all the versions of GoogLeNet and their difference. Give it a try.
$endgroup$
– Amartya Ranjan Saikia
Feb 13 '18 at 4:21










4 Answers
4






active

oldest

votes


















18












$begingroup$

In the paper Batch Normalization,Sergey et al,2015. proposed Inception-v1 architecture which is a variant of the GoogleNet in the paper Going deeper with convolutions, and in the meanwhile they introduced Batch Normalization to Inception(BN-Inception).




The main difference to the network described in (Szegedy et al.,2014)
is that the 5x5 convolutional layers are replaced by two consecutive
layer of 3x3 convolutions with up to 128 filters.




And in the paper Rethinking the Inception Architecture for Computer Vision, the authors proposed Inception-v2 and Inception-v3.



In the Inception-v2, they introduced Factorization(factorize convolutions into smaller convolutions) and some minor change into Inception-v1.




Note that we have factorized the traditional 7x7 convolution into
three 3x3 convolutions




As for Inception-v3, it is a variant of Inception-v2 which adds BN-auxiliary.




BN auxiliary refers to the version in which the fully connected layer of the auxiliary classifier is also-normalized, not just convolutions. We are refering to the model [Inception-v2 + BN auxiliary] as Inception-v3.







share|improve this answer











$endgroup$





















    2












    $begingroup$

    beside what was mentioned by daoliker



    inception v2 utilized separable convolution as first layer of depth 64




    • function usage

    • function definition

    • paper


    quote from paper




    Our model employed separable convolution with depth
    multiplier 8 on the first convolutional layer. This reduces
    the computational cost while increasing the memory consumption
    at training time.




    why this is important? because it was dropped in v3 and v4 and inception resnet, but re-introduced and heavily used in mobilenet later.






    share|improve this answer









    $endgroup$





















      1












      $begingroup$

      The answer can be found in the Going deeper with convolutions paper: https://arxiv.org/pdf/1512.00567v3.pdf



      Check Table 3. Inception v2 is the architecture described in the Going deeper with convolutions paper. Inception v3 is the same architecture (minor changes) with different training algorithm (RMSprop, label smoothing regularizer, adding an auxiliary head with batch norm to improve training etc).






      share|improve this answer











      $endgroup$





















        0












        $begingroup$

        Actually, the answers above seem to be wrong. Indeed, it was a big mess with the naming. However, it seems that it was fixed in the paper that introduces Inception-v4 (see: "Inception-v4, Inception-ResNet and the Impact of Residual Connections on Learning"):




        The Inception deep convolutional architecture was introduced as GoogLeNet in (Szegedy et al. 2015a), here named Inception-v1. Later the Inception architecture was refined in various ways, first by the introduction of batch normalization (Ioffe and Szegedy 2015) (Inception-v2). Later by additional factorization ideas in the third iteration (Szegedy et al. 2015b) which will be referred to as Inception-v3 in this report.







        share|improve this answer








        New contributor




        desa is a new contributor to this site. Take care in asking for clarification, commenting, and answering.
        Check out our Code of Conduct.






        $endgroup$













          Your Answer





          StackExchange.ifUsing("editor", function () {
          return StackExchange.using("mathjaxEditing", function () {
          StackExchange.MarkdownEditor.creationCallbacks.add(function (editor, postfix) {
          StackExchange.mathjaxEditing.prepareWmdForMathJax(editor, postfix, [["$", "$"], ["\\(","\\)"]]);
          });
          });
          }, "mathjax-editing");

          StackExchange.ready(function() {
          var channelOptions = {
          tags: "".split(" "),
          id: "557"
          };
          initTagRenderer("".split(" "), "".split(" "), channelOptions);

          StackExchange.using("externalEditor", function() {
          // Have to fire editor after snippets, if snippets enabled
          if (StackExchange.settings.snippets.snippetsEnabled) {
          StackExchange.using("snippets", function() {
          createEditor();
          });
          }
          else {
          createEditor();
          }
          });

          function createEditor() {
          StackExchange.prepareEditor({
          heartbeatType: 'answer',
          autoActivateHeartbeat: false,
          convertImagesToLinks: false,
          noModals: true,
          showLowRepImageUploadWarning: true,
          reputationToPostImages: null,
          bindNavPrevention: true,
          postfix: "",
          imageUploader: {
          brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
          contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
          allowUrls: true
          },
          onDemand: true,
          discardSelector: ".discard-answer"
          ,immediatelyShowMarkdownHelp:true
          });


          }
          });














          draft saved

          draft discarded


















          StackExchange.ready(
          function () {
          StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fdatascience.stackexchange.com%2fquestions%2f15328%2fwhat-is-the-difference-between-inception-v2-and-inception-v3%23new-answer', 'question_page');
          }
          );

          Post as a guest















          Required, but never shown

























          4 Answers
          4






          active

          oldest

          votes








          4 Answers
          4






          active

          oldest

          votes









          active

          oldest

          votes






          active

          oldest

          votes









          18












          $begingroup$

          In the paper Batch Normalization,Sergey et al,2015. proposed Inception-v1 architecture which is a variant of the GoogleNet in the paper Going deeper with convolutions, and in the meanwhile they introduced Batch Normalization to Inception(BN-Inception).




          The main difference to the network described in (Szegedy et al.,2014)
          is that the 5x5 convolutional layers are replaced by two consecutive
          layer of 3x3 convolutions with up to 128 filters.




          And in the paper Rethinking the Inception Architecture for Computer Vision, the authors proposed Inception-v2 and Inception-v3.



          In the Inception-v2, they introduced Factorization(factorize convolutions into smaller convolutions) and some minor change into Inception-v1.




          Note that we have factorized the traditional 7x7 convolution into
          three 3x3 convolutions




          As for Inception-v3, it is a variant of Inception-v2 which adds BN-auxiliary.




          BN auxiliary refers to the version in which the fully connected layer of the auxiliary classifier is also-normalized, not just convolutions. We are refering to the model [Inception-v2 + BN auxiliary] as Inception-v3.







          share|improve this answer











          $endgroup$


















            18












            $begingroup$

            In the paper Batch Normalization,Sergey et al,2015. proposed Inception-v1 architecture which is a variant of the GoogleNet in the paper Going deeper with convolutions, and in the meanwhile they introduced Batch Normalization to Inception(BN-Inception).




            The main difference to the network described in (Szegedy et al.,2014)
            is that the 5x5 convolutional layers are replaced by two consecutive
            layer of 3x3 convolutions with up to 128 filters.




            And in the paper Rethinking the Inception Architecture for Computer Vision, the authors proposed Inception-v2 and Inception-v3.



            In the Inception-v2, they introduced Factorization(factorize convolutions into smaller convolutions) and some minor change into Inception-v1.




            Note that we have factorized the traditional 7x7 convolution into
            three 3x3 convolutions




            As for Inception-v3, it is a variant of Inception-v2 which adds BN-auxiliary.




            BN auxiliary refers to the version in which the fully connected layer of the auxiliary classifier is also-normalized, not just convolutions. We are refering to the model [Inception-v2 + BN auxiliary] as Inception-v3.







            share|improve this answer











            $endgroup$
















              18












              18








              18





              $begingroup$

              In the paper Batch Normalization,Sergey et al,2015. proposed Inception-v1 architecture which is a variant of the GoogleNet in the paper Going deeper with convolutions, and in the meanwhile they introduced Batch Normalization to Inception(BN-Inception).




              The main difference to the network described in (Szegedy et al.,2014)
              is that the 5x5 convolutional layers are replaced by two consecutive
              layer of 3x3 convolutions with up to 128 filters.




              And in the paper Rethinking the Inception Architecture for Computer Vision, the authors proposed Inception-v2 and Inception-v3.



              In the Inception-v2, they introduced Factorization(factorize convolutions into smaller convolutions) and some minor change into Inception-v1.




              Note that we have factorized the traditional 7x7 convolution into
              three 3x3 convolutions




              As for Inception-v3, it is a variant of Inception-v2 which adds BN-auxiliary.




              BN auxiliary refers to the version in which the fully connected layer of the auxiliary classifier is also-normalized, not just convolutions. We are refering to the model [Inception-v2 + BN auxiliary] as Inception-v3.







              share|improve this answer











              $endgroup$



              In the paper Batch Normalization,Sergey et al,2015. proposed Inception-v1 architecture which is a variant of the GoogleNet in the paper Going deeper with convolutions, and in the meanwhile they introduced Batch Normalization to Inception(BN-Inception).




              The main difference to the network described in (Szegedy et al.,2014)
              is that the 5x5 convolutional layers are replaced by two consecutive
              layer of 3x3 convolutions with up to 128 filters.




              And in the paper Rethinking the Inception Architecture for Computer Vision, the authors proposed Inception-v2 and Inception-v3.



              In the Inception-v2, they introduced Factorization(factorize convolutions into smaller convolutions) and some minor change into Inception-v1.




              Note that we have factorized the traditional 7x7 convolution into
              three 3x3 convolutions




              As for Inception-v3, it is a variant of Inception-v2 which adds BN-auxiliary.




              BN auxiliary refers to the version in which the fully connected layer of the auxiliary classifier is also-normalized, not just convolutions. We are refering to the model [Inception-v2 + BN auxiliary] as Inception-v3.








              share|improve this answer














              share|improve this answer



              share|improve this answer








              edited Jun 28 '17 at 17:07









              Muayyad Alsadi

              1233




              1233










              answered Feb 21 '17 at 12:11









              daolikerdaoliker

              28125




              28125























                  2












                  $begingroup$

                  beside what was mentioned by daoliker



                  inception v2 utilized separable convolution as first layer of depth 64




                  • function usage

                  • function definition

                  • paper


                  quote from paper




                  Our model employed separable convolution with depth
                  multiplier 8 on the first convolutional layer. This reduces
                  the computational cost while increasing the memory consumption
                  at training time.




                  why this is important? because it was dropped in v3 and v4 and inception resnet, but re-introduced and heavily used in mobilenet later.






                  share|improve this answer









                  $endgroup$


















                    2












                    $begingroup$

                    beside what was mentioned by daoliker



                    inception v2 utilized separable convolution as first layer of depth 64




                    • function usage

                    • function definition

                    • paper


                    quote from paper




                    Our model employed separable convolution with depth
                    multiplier 8 on the first convolutional layer. This reduces
                    the computational cost while increasing the memory consumption
                    at training time.




                    why this is important? because it was dropped in v3 and v4 and inception resnet, but re-introduced and heavily used in mobilenet later.






                    share|improve this answer









                    $endgroup$
















                      2












                      2








                      2





                      $begingroup$

                      beside what was mentioned by daoliker



                      inception v2 utilized separable convolution as first layer of depth 64




                      • function usage

                      • function definition

                      • paper


                      quote from paper




                      Our model employed separable convolution with depth
                      multiplier 8 on the first convolutional layer. This reduces
                      the computational cost while increasing the memory consumption
                      at training time.




                      why this is important? because it was dropped in v3 and v4 and inception resnet, but re-introduced and heavily used in mobilenet later.






                      share|improve this answer









                      $endgroup$



                      beside what was mentioned by daoliker



                      inception v2 utilized separable convolution as first layer of depth 64




                      • function usage

                      • function definition

                      • paper


                      quote from paper




                      Our model employed separable convolution with depth
                      multiplier 8 on the first convolutional layer. This reduces
                      the computational cost while increasing the memory consumption
                      at training time.




                      why this is important? because it was dropped in v3 and v4 and inception resnet, but re-introduced and heavily used in mobilenet later.







                      share|improve this answer












                      share|improve this answer



                      share|improve this answer










                      answered Jun 21 '17 at 22:40









                      Muayyad AlsadiMuayyad Alsadi

                      1233




                      1233























                          1












                          $begingroup$

                          The answer can be found in the Going deeper with convolutions paper: https://arxiv.org/pdf/1512.00567v3.pdf



                          Check Table 3. Inception v2 is the architecture described in the Going deeper with convolutions paper. Inception v3 is the same architecture (minor changes) with different training algorithm (RMSprop, label smoothing regularizer, adding an auxiliary head with batch norm to improve training etc).






                          share|improve this answer











                          $endgroup$


















                            1












                            $begingroup$

                            The answer can be found in the Going deeper with convolutions paper: https://arxiv.org/pdf/1512.00567v3.pdf



                            Check Table 3. Inception v2 is the architecture described in the Going deeper with convolutions paper. Inception v3 is the same architecture (minor changes) with different training algorithm (RMSprop, label smoothing regularizer, adding an auxiliary head with batch norm to improve training etc).






                            share|improve this answer











                            $endgroup$
















                              1












                              1








                              1





                              $begingroup$

                              The answer can be found in the Going deeper with convolutions paper: https://arxiv.org/pdf/1512.00567v3.pdf



                              Check Table 3. Inception v2 is the architecture described in the Going deeper with convolutions paper. Inception v3 is the same architecture (minor changes) with different training algorithm (RMSprop, label smoothing regularizer, adding an auxiliary head with batch norm to improve training etc).






                              share|improve this answer











                              $endgroup$



                              The answer can be found in the Going deeper with convolutions paper: https://arxiv.org/pdf/1512.00567v3.pdf



                              Check Table 3. Inception v2 is the architecture described in the Going deeper with convolutions paper. Inception v3 is the same architecture (minor changes) with different training algorithm (RMSprop, label smoothing regularizer, adding an auxiliary head with batch norm to improve training etc).







                              share|improve this answer














                              share|improve this answer



                              share|improve this answer








                              edited Jan 18 '17 at 4:50

























                              answered Jan 18 '17 at 4:05









                              Sid MSid M

                              112




                              112























                                  0












                                  $begingroup$

                                  Actually, the answers above seem to be wrong. Indeed, it was a big mess with the naming. However, it seems that it was fixed in the paper that introduces Inception-v4 (see: "Inception-v4, Inception-ResNet and the Impact of Residual Connections on Learning"):




                                  The Inception deep convolutional architecture was introduced as GoogLeNet in (Szegedy et al. 2015a), here named Inception-v1. Later the Inception architecture was refined in various ways, first by the introduction of batch normalization (Ioffe and Szegedy 2015) (Inception-v2). Later by additional factorization ideas in the third iteration (Szegedy et al. 2015b) which will be referred to as Inception-v3 in this report.







                                  share|improve this answer








                                  New contributor




                                  desa is a new contributor to this site. Take care in asking for clarification, commenting, and answering.
                                  Check out our Code of Conduct.






                                  $endgroup$


















                                    0












                                    $begingroup$

                                    Actually, the answers above seem to be wrong. Indeed, it was a big mess with the naming. However, it seems that it was fixed in the paper that introduces Inception-v4 (see: "Inception-v4, Inception-ResNet and the Impact of Residual Connections on Learning"):




                                    The Inception deep convolutional architecture was introduced as GoogLeNet in (Szegedy et al. 2015a), here named Inception-v1. Later the Inception architecture was refined in various ways, first by the introduction of batch normalization (Ioffe and Szegedy 2015) (Inception-v2). Later by additional factorization ideas in the third iteration (Szegedy et al. 2015b) which will be referred to as Inception-v3 in this report.







                                    share|improve this answer








                                    New contributor




                                    desa is a new contributor to this site. Take care in asking for clarification, commenting, and answering.
                                    Check out our Code of Conduct.






                                    $endgroup$
















                                      0












                                      0








                                      0





                                      $begingroup$

                                      Actually, the answers above seem to be wrong. Indeed, it was a big mess with the naming. However, it seems that it was fixed in the paper that introduces Inception-v4 (see: "Inception-v4, Inception-ResNet and the Impact of Residual Connections on Learning"):




                                      The Inception deep convolutional architecture was introduced as GoogLeNet in (Szegedy et al. 2015a), here named Inception-v1. Later the Inception architecture was refined in various ways, first by the introduction of batch normalization (Ioffe and Szegedy 2015) (Inception-v2). Later by additional factorization ideas in the third iteration (Szegedy et al. 2015b) which will be referred to as Inception-v3 in this report.







                                      share|improve this answer








                                      New contributor




                                      desa is a new contributor to this site. Take care in asking for clarification, commenting, and answering.
                                      Check out our Code of Conduct.






                                      $endgroup$



                                      Actually, the answers above seem to be wrong. Indeed, it was a big mess with the naming. However, it seems that it was fixed in the paper that introduces Inception-v4 (see: "Inception-v4, Inception-ResNet and the Impact of Residual Connections on Learning"):




                                      The Inception deep convolutional architecture was introduced as GoogLeNet in (Szegedy et al. 2015a), here named Inception-v1. Later the Inception architecture was refined in various ways, first by the introduction of batch normalization (Ioffe and Szegedy 2015) (Inception-v2). Later by additional factorization ideas in the third iteration (Szegedy et al. 2015b) which will be referred to as Inception-v3 in this report.








                                      share|improve this answer








                                      New contributor




                                      desa is a new contributor to this site. Take care in asking for clarification, commenting, and answering.
                                      Check out our Code of Conduct.









                                      share|improve this answer



                                      share|improve this answer






                                      New contributor




                                      desa is a new contributor to this site. Take care in asking for clarification, commenting, and answering.
                                      Check out our Code of Conduct.









                                      answered 50 mins ago









                                      desadesa

                                      101




                                      101




                                      New contributor




                                      desa is a new contributor to this site. Take care in asking for clarification, commenting, and answering.
                                      Check out our Code of Conduct.





                                      New contributor





                                      desa is a new contributor to this site. Take care in asking for clarification, commenting, and answering.
                                      Check out our Code of Conduct.






                                      desa is a new contributor to this site. Take care in asking for clarification, commenting, and answering.
                                      Check out our Code of Conduct.






























                                          draft saved

                                          draft discarded




















































                                          Thanks for contributing an answer to Data Science Stack Exchange!


                                          • Please be sure to answer the question. Provide details and share your research!

                                          But avoid



                                          • Asking for help, clarification, or responding to other answers.

                                          • Making statements based on opinion; back them up with references or personal experience.


                                          Use MathJax to format equations. MathJax reference.


                                          To learn more, see our tips on writing great answers.




                                          draft saved


                                          draft discarded














                                          StackExchange.ready(
                                          function () {
                                          StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fdatascience.stackexchange.com%2fquestions%2f15328%2fwhat-is-the-difference-between-inception-v2-and-inception-v3%23new-answer', 'question_page');
                                          }
                                          );

                                          Post as a guest















                                          Required, but never shown





















































                                          Required, but never shown














                                          Required, but never shown












                                          Required, but never shown







                                          Required, but never shown

































                                          Required, but never shown














                                          Required, but never shown












                                          Required, but never shown







                                          Required, but never shown







                                          Popular posts from this blog

                                          How to label and detect the document text images

                                          Tabula Rosettana

                                          Aureus (color)