What is the fastest way to count occurrences of equal sublists in a nested list?



























I have a list of lists in Python, and I want to append to each sublist the number of times it appears in the nested list. It needs to be as fast as possible (this is very important).



I have done this with a pandas DataFrame, but it seems very slow, and I need to run these lines at very large scale. I am completely willing to sacrifice readable code for efficient code.



For instance, my nested list is:



l = [[1, 3, 2], [1, 3, 2] ,[1, 3, 5]]


I need to have:



res = [[1, 3, 2, 2], [1, 3, 5, 1]]


EDIT



Order in res does not matter at all.










asked 2 hours ago by Léo Joubert, edited 2 hours ago by Muhammad Ahmad



          3 Answers














          If order does not matter, you could use collections.Counter with extended iterable unpacking, as a variant of @Chris_Rands' solution:



          from collections import Counter

          l = [[1, 3, 2], [1, 3, 2] ,[1, 3, 5]]

          result = [[*t, count] for t, count in Counter(map(tuple, l)).items()]
          print(result)


          Output (the order of the sublists may differ depending on the Python version)



          [[1, 3, 5, 1], [1, 3, 2, 2]]
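Whether the unpacking variant is any faster than building the row with list(k) + [v] depends on the data, so it is worth measuring. Below is a minimal timeit sketch, not a definitive benchmark. The helper names (with_unpacking, with_concatenation, make_rows) and the synthetic data are assumptions made for illustration; swap in your real data before drawing conclusions.

```python
from collections import Counter
import random
import timeit


def with_unpacking(rows):
    # Variant from this answer: unpack each tuple key and append its count.
    return [[*t, count] for t, count in Counter(map(tuple, rows)).items()]


def with_concatenation(rows):
    # Variant from the other Counter answer: list(key) + [count].
    return [list(k) + [v] for k, v in Counter(map(tuple, rows)).items()]


def make_rows(n_rows, n_distinct=50, seed=0):
    # Hypothetical test data: n_rows sublists drawn from a small pool of
    # 3-element patterns, so that many sublists repeat.
    rng = random.Random(seed)
    patterns = [[rng.randrange(10) for _ in range(3)] for _ in range(n_distinct)]
    return [list(rng.choice(patterns)) for _ in range(n_rows)]


if __name__ == "__main__":
    rows = make_rows(20_000)
    for func in (with_unpacking, with_concatenation):
        # Time 5 full calls and report the average per call.
        seconds = timeit.timeit(lambda: func(rows), number=5)
        print(f"{func.__name__}: {seconds / 5:.4f} s per call")
```

Both functions do the same hashing work inside Counter, so any difference comes only from how the output rows are built.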





            This is a reasonable (although mostly cosmetic) variant of my solution, assuming Python 3 of course

            – Chris_Rands
            2 hours ago

































          This is quite an odd output to want, but it is of course possible. I suggest using collections.Counter(). No doubt others will make different suggestions, and a timeit-style comparison would reveal which is fastest for a particular data set:



          >>> from collections import Counter
          >>> l = [[1, 3, 2], [1, 3, 2] ,[1, 3, 5]]
          >>> [list(k) + [v] for k, v in Counter(map(tuple,l)).items()]
          [[1, 3, 2, 2], [1, 3, 5, 1]]


          Note: to preserve the insertion order prior to CPython 3.6 / Python 3.7, use the OrderedCounter recipe.
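For readers who have not seen it, here is a minimal sketch of what such a recipe can look like. It assumes the recipe meant is the commonly cited one that combines Counter with OrderedDict through multiple inheritance; the class below is a simplified form written for illustration, not something taken from this answer.

```python
from collections import Counter, OrderedDict


class OrderedCounter(Counter, OrderedDict):
    """Counter that remembers the order in which keys were first seen."""


l = [[1, 3, 2], [1, 3, 2], [1, 3, 5]]

# Same comprehension as above, with OrderedCounter swapped in for Counter,
# so the rows come out in first-seen order.
res = [list(k) + [v] for k, v in OrderedCounter(map(tuple, l)).items()]
print(res)
```

On Python 3.7 and later a plain Counter already keeps insertion order, so the subclass only matters on older interpreters.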




















            If NumPy is an option, you could use np.unique with axis=0 and return_counts=True, and then concatenate the unique rows and the counts using np.vstack:



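            import numpy as np

            l = np.array([[1, 3, 2], [1, 3, 2], [1, 3, 5]])
            # Unique rows (returned in sorted order) and how often each occurs
            x, c = np.unique(l, axis=0, return_counts=True)
            # Append the counts as a last column
            np.vstack([x.T, c]).T

            array([[1, 3, 2, 2],
                   [1, 3, 5, 1]])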
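If the result is needed as a plain list of lists, as in the question, the same idea can be wrapped in a small function. This is only a sketch: the function name count_rows is made up for illustration, np.column_stack is used here in place of the double transpose with np.vstack, and it assumes all sublists have the same length and hold numbers, so that they form a regular 2-D array.

```python
import numpy as np


def count_rows(rows):
    """Return the unique rows with their counts appended, as a list of lists."""
    arr = np.asarray(rows)
    # With axis=0, np.unique treats each row as one item and returns the
    # unique rows in sorted order, plus how often each one occurred.
    unique_rows, counts = np.unique(arr, axis=0, return_counts=True)
    # column_stack appends the 1-D counts as a final column.
    return np.column_stack((unique_rows, counts)).tolist()


print(count_rows([[1, 3, 2], [1, 3, 2], [1, 3, 5]]))
```

Note that np.unique sorts the rows, so the result is ordered by row content rather than by first appearance. That is fine here because order does not matter.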





              Counter with iterable unpacking: answered 2 hours ago by Daniel Mesejo, edited 1 hour ago

              Counter with list concatenation: answered 2 hours ago by Chris_Rands

              np.unique with return_counts: answered 2 hours ago by yatu
