(Codewars) Linked Lists - Remove Duplicates












2












$begingroup$


Kata: https://www.codewars.com/kata/linked-lists-remove-duplicates/train/python



Description




Write a RemoveDuplicates() function which takes a list sorted in increasing order and deletes any duplicate nodes from the list. Ideally, the list should only be traversed once. The head of the resulting list should be returned.



var list = 1 -> 2 -> 3 -> 3 -> 4 -> 4 -> 5 -> null
removeDuplicates(list) === 1 -> 2 -> 3 -> 4 -> 5 -> null


If the passed in list is null/None/nil, simply return null.



Note: Your solution is expected to work on long lists. Recursive solutions may fail due to stack size limitations.




My Solution



class Node(object):
def __init__(self, data, nxt=None):
self.data = data
self.next = nxt

def remove_duplicates(head):
node = Node(None, head)
st = set()
while node.next:
if node.next.data in st:
tmp = node.next
node.next = node.next.next
del tmp
else:
st.add(node.next.data)
node = node.next
return head




I am interested in performance and adhering to best practices.










share|improve this question









$endgroup$

















    2












    $begingroup$


    Kata: https://www.codewars.com/kata/linked-lists-remove-duplicates/train/python



    Description




    Write a RemoveDuplicates() function which takes a list sorted in increasing order and deletes any duplicate nodes from the list. Ideally, the list should only be traversed once. The head of the resulting list should be returned.



    var list = 1 -> 2 -> 3 -> 3 -> 4 -> 4 -> 5 -> null
    removeDuplicates(list) === 1 -> 2 -> 3 -> 4 -> 5 -> null


    If the passed in list is null/None/nil, simply return null.



    Note: Your solution is expected to work on long lists. Recursive solutions may fail due to stack size limitations.




    My Solution



    class Node(object):
    def __init__(self, data, nxt=None):
    self.data = data
    self.next = nxt

    def remove_duplicates(head):
    node = Node(None, head)
    st = set()
    while node.next:
    if node.next.data in st:
    tmp = node.next
    node.next = node.next.next
    del tmp
    else:
    st.add(node.next.data)
    node = node.next
    return head




    I am interested in performance and adhering to best practices.










    share|improve this question









    $endgroup$















      2












      2








      2





      $begingroup$


      Kata: https://www.codewars.com/kata/linked-lists-remove-duplicates/train/python



      Description




      Write a RemoveDuplicates() function which takes a list sorted in increasing order and deletes any duplicate nodes from the list. Ideally, the list should only be traversed once. The head of the resulting list should be returned.



      var list = 1 -> 2 -> 3 -> 3 -> 4 -> 4 -> 5 -> null
      removeDuplicates(list) === 1 -> 2 -> 3 -> 4 -> 5 -> null


      If the passed in list is null/None/nil, simply return null.



      Note: Your solution is expected to work on long lists. Recursive solutions may fail due to stack size limitations.




      My Solution



      class Node(object):
      def __init__(self, data, nxt=None):
      self.data = data
      self.next = nxt

      def remove_duplicates(head):
      node = Node(None, head)
      st = set()
      while node.next:
      if node.next.data in st:
      tmp = node.next
      node.next = node.next.next
      del tmp
      else:
      st.add(node.next.data)
      node = node.next
      return head




      I am interested in performance and adhering to best practices.










      share|improve this question









      $endgroup$




      Kata: https://www.codewars.com/kata/linked-lists-remove-duplicates/train/python



      Description




      Write a RemoveDuplicates() function which takes a list sorted in increasing order and deletes any duplicate nodes from the list. Ideally, the list should only be traversed once. The head of the resulting list should be returned.



      var list = 1 -> 2 -> 3 -> 3 -> 4 -> 4 -> 5 -> null
      removeDuplicates(list) === 1 -> 2 -> 3 -> 4 -> 5 -> null


      If the passed in list is null/None/nil, simply return null.



      Note: Your solution is expected to work on long lists. Recursive solutions may fail due to stack size limitations.




      My Solution



      class Node(object):
      def __init__(self, data, nxt=None):
      self.data = data
      self.next = nxt

      def remove_duplicates(head):
      node = Node(None, head)
      st = set()
      while node.next:
      if node.next.data in st:
      tmp = node.next
      node.next = node.next.next
      del tmp
      else:
      st.add(node.next.data)
      node = node.next
      return head




      I am interested in performance and adhering to best practices.







      python performance beginner programming-challenge linked-list






      share|improve this question













      share|improve this question











      share|improve this question




      share|improve this question










      asked 15 hours ago









      Tobi AlafinTobi Alafin

      54619




      54619






















          1 Answer
          1






          active

          oldest

          votes


















          6












          $begingroup$

          The space complexity of your solution is $O(n)$. What if I told you that given the constraints in the question, you could solve this problem in $O(1)$ space?



          Consider that you receive an already sorted list. This implies that all duplicates must be adjacent. This in turn means that for deduplicating these, you do not need to keep track of which values you already encountered.



          Note that del is not necessary here. You never use tmp aside from assigning node.next to it in any case. del is not clearing the memory allocated to the variable (or the object it refers to), it only destroys the variable itself, the object is unaffected. As such it does not have any effect on the garbage-collection that python performs...



          The following should already work just fine:



          def remove_duplicates(head):
          node = Node(None, head)
          while node.next:
          if node.next.data == node.data:
          node.next = node.next.next
          else:
          node = node.next
          return head




          Small sidenote: I really like the hack you used to ensure that you handle the head is None case, but you should be aware that it is just a hack. The cleaner solution to this would be:



          def remove_duplicates(head):
          if not head:
          return head
          node = head
          while node.next:
          # Rest of the code...





          share|improve this answer











          $endgroup$









          • 1




            $begingroup$
            Surely if head is None would be better than if not head, no? It may or may not be faster, depending on where the bottlenecks are (it's usually faster) and it more clearly conveys the intent of the code.
            $endgroup$
            – wizzwizz4
            10 hours ago










          • $begingroup$
            Thanks for this, yours is really neat. It still looks like it's $O(n)$ to me as you have to iterate through the list?
            $endgroup$
            – Tobi Alafin
            10 hours ago










          • $begingroup$
            I find special cases aesthetically unappealing which is why I used the "hack"; is that considered bad form?
            $endgroup$
            – Tobi Alafin
            10 hours ago






          • 1




            $begingroup$
            it is in fact O(n) in time, but O(1) in space. Special cases are something I prefer to handle "separately". YMMV, but as soon as special cases become harder to reconciliate with the core algorithm, it makes more and more sense to handle them in dedicated branches
            $endgroup$
            – Vogel612
            9 hours ago











          Your Answer





          StackExchange.ifUsing("editor", function () {
          return StackExchange.using("mathjaxEditing", function () {
          StackExchange.MarkdownEditor.creationCallbacks.add(function (editor, postfix) {
          StackExchange.mathjaxEditing.prepareWmdForMathJax(editor, postfix, [["\$", "\$"]]);
          });
          });
          }, "mathjax-editing");

          StackExchange.ifUsing("editor", function () {
          StackExchange.using("externalEditor", function () {
          StackExchange.using("snippets", function () {
          StackExchange.snippets.init();
          });
          });
          }, "code-snippets");

          StackExchange.ready(function() {
          var channelOptions = {
          tags: "".split(" "),
          id: "196"
          };
          initTagRenderer("".split(" "), "".split(" "), channelOptions);

          StackExchange.using("externalEditor", function() {
          // Have to fire editor after snippets, if snippets enabled
          if (StackExchange.settings.snippets.snippetsEnabled) {
          StackExchange.using("snippets", function() {
          createEditor();
          });
          }
          else {
          createEditor();
          }
          });

          function createEditor() {
          StackExchange.prepareEditor({
          heartbeatType: 'answer',
          autoActivateHeartbeat: false,
          convertImagesToLinks: false,
          noModals: true,
          showLowRepImageUploadWarning: true,
          reputationToPostImages: null,
          bindNavPrevention: true,
          postfix: "",
          imageUploader: {
          brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
          contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
          allowUrls: true
          },
          onDemand: true,
          discardSelector: ".discard-answer"
          ,immediatelyShowMarkdownHelp:true
          });


          }
          });














          draft saved

          draft discarded


















          StackExchange.ready(
          function () {
          StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fcodereview.stackexchange.com%2fquestions%2f215140%2fcodewars-linked-lists-remove-duplicates%23new-answer', 'question_page');
          }
          );

          Post as a guest















          Required, but never shown

























          1 Answer
          1






          active

          oldest

          votes








          1 Answer
          1






          active

          oldest

          votes









          active

          oldest

          votes






          active

          oldest

          votes









          6












          $begingroup$

          The space complexity of your solution is $O(n)$. What if I told you that given the constraints in the question, you could solve this problem in $O(1)$ space?



          Consider that you receive an already sorted list. This implies that all duplicates must be adjacent. This in turn means that for deduplicating these, you do not need to keep track of which values you already encountered.



          Note that del is not necessary here. You never use tmp aside from assigning node.next to it in any case. del is not clearing the memory allocated to the variable (or the object it refers to), it only destroys the variable itself, the object is unaffected. As such it does not have any effect on the garbage-collection that python performs...



          The following should already work just fine:



          def remove_duplicates(head):
          node = Node(None, head)
          while node.next:
          if node.next.data == node.data:
          node.next = node.next.next
          else:
          node = node.next
          return head




          Small sidenote: I really like the hack you used to ensure that you handle the head is None case, but you should be aware that it is just a hack. The cleaner solution to this would be:



          def remove_duplicates(head):
          if not head:
          return head
          node = head
          while node.next:
          # Rest of the code...





          share|improve this answer











          $endgroup$









          • 1




            $begingroup$
            Surely if head is None would be better than if not head, no? It may or may not be faster, depending on where the bottlenecks are (it's usually faster) and it more clearly conveys the intent of the code.
            $endgroup$
            – wizzwizz4
            10 hours ago










          • $begingroup$
            Thanks for this, yours is really neat. It still looks like it's $O(n)$ to me as you have to iterate through the list?
            $endgroup$
            – Tobi Alafin
            10 hours ago










          • $begingroup$
            I find special cases aesthetically unappealing which is why I used the "hack"; is that considered bad form?
            $endgroup$
            – Tobi Alafin
            10 hours ago






          • 1




            $begingroup$
            it is in fact O(n) in time, but O(1) in space. Special cases are something I prefer to handle "separately". YMMV, but as soon as special cases become harder to reconciliate with the core algorithm, it makes more and more sense to handle them in dedicated branches
            $endgroup$
            – Vogel612
            9 hours ago
















          6












          $begingroup$

          The space complexity of your solution is $O(n)$. What if I told you that given the constraints in the question, you could solve this problem in $O(1)$ space?



          Consider that you receive an already sorted list. This implies that all duplicates must be adjacent. This in turn means that for deduplicating these, you do not need to keep track of which values you already encountered.



          Note that del is not necessary here. You never use tmp aside from assigning node.next to it in any case. del is not clearing the memory allocated to the variable (or the object it refers to), it only destroys the variable itself, the object is unaffected. As such it does not have any effect on the garbage-collection that python performs...



          The following should already work just fine:



          def remove_duplicates(head):
          node = Node(None, head)
          while node.next:
          if node.next.data == node.data:
          node.next = node.next.next
          else:
          node = node.next
          return head




          Small sidenote: I really like the hack you used to ensure that you handle the head is None case, but you should be aware that it is just a hack. The cleaner solution to this would be:



          def remove_duplicates(head):
          if not head:
          return head
          node = head
          while node.next:
          # Rest of the code...





          share|improve this answer











          $endgroup$









          • 1




            $begingroup$
            Surely if head is None would be better than if not head, no? It may or may not be faster, depending on where the bottlenecks are (it's usually faster) and it more clearly conveys the intent of the code.
            $endgroup$
            – wizzwizz4
            10 hours ago










          • $begingroup$
            Thanks for this, yours is really neat. It still looks like it's $O(n)$ to me as you have to iterate through the list?
            $endgroup$
            – Tobi Alafin
            10 hours ago










          • $begingroup$
            I find special cases aesthetically unappealing which is why I used the "hack"; is that considered bad form?
            $endgroup$
            – Tobi Alafin
            10 hours ago






          • 1




            $begingroup$
            it is in fact O(n) in time, but O(1) in space. Special cases are something I prefer to handle "separately". YMMV, but as soon as special cases become harder to reconciliate with the core algorithm, it makes more and more sense to handle them in dedicated branches
            $endgroup$
            – Vogel612
            9 hours ago














          6












          6








          6





          $begingroup$

          The space complexity of your solution is $O(n)$. What if I told you that given the constraints in the question, you could solve this problem in $O(1)$ space?



          Consider that you receive an already sorted list. This implies that all duplicates must be adjacent. This in turn means that for deduplicating these, you do not need to keep track of which values you already encountered.



          Note that del is not necessary here. You never use tmp aside from assigning node.next to it in any case. del is not clearing the memory allocated to the variable (or the object it refers to), it only destroys the variable itself, the object is unaffected. As such it does not have any effect on the garbage-collection that python performs...



          The following should already work just fine:



          def remove_duplicates(head):
          node = Node(None, head)
          while node.next:
          if node.next.data == node.data:
          node.next = node.next.next
          else:
          node = node.next
          return head




          Small sidenote: I really like the hack you used to ensure that you handle the head is None case, but you should be aware that it is just a hack. The cleaner solution to this would be:



          def remove_duplicates(head):
          if not head:
          return head
          node = head
          while node.next:
          # Rest of the code...





          share|improve this answer











          $endgroup$



          The space complexity of your solution is $O(n)$. What if I told you that given the constraints in the question, you could solve this problem in $O(1)$ space?



          Consider that you receive an already sorted list. This implies that all duplicates must be adjacent. This in turn means that for deduplicating these, you do not need to keep track of which values you already encountered.



          Note that del is not necessary here. You never use tmp aside from assigning node.next to it in any case. del is not clearing the memory allocated to the variable (or the object it refers to), it only destroys the variable itself, the object is unaffected. As such it does not have any effect on the garbage-collection that python performs...



          The following should already work just fine:



          def remove_duplicates(head):
          node = Node(None, head)
          while node.next:
          if node.next.data == node.data:
          node.next = node.next.next
          else:
          node = node.next
          return head




          Small sidenote: I really like the hack you used to ensure that you handle the head is None case, but you should be aware that it is just a hack. The cleaner solution to this would be:



          def remove_duplicates(head):
          if not head:
          return head
          node = head
          while node.next:
          # Rest of the code...






          share|improve this answer














          share|improve this answer



          share|improve this answer








          edited 9 hours ago









          Tobi Alafin

          54619




          54619










          answered 14 hours ago









          Vogel612Vogel612

          21.8k447130




          21.8k447130








          • 1




            $begingroup$
            Surely if head is None would be better than if not head, no? It may or may not be faster, depending on where the bottlenecks are (it's usually faster) and it more clearly conveys the intent of the code.
            $endgroup$
            – wizzwizz4
            10 hours ago










          • $begingroup$
            Thanks for this, yours is really neat. It still looks like it's $O(n)$ to me as you have to iterate through the list?
            $endgroup$
            – Tobi Alafin
            10 hours ago










          • $begingroup$
            I find special cases aesthetically unappealing which is why I used the "hack"; is that considered bad form?
            $endgroup$
            – Tobi Alafin
            10 hours ago






          • 1




            $begingroup$
            it is in fact O(n) in time, but O(1) in space. Special cases are something I prefer to handle "separately". YMMV, but as soon as special cases become harder to reconciliate with the core algorithm, it makes more and more sense to handle them in dedicated branches
            $endgroup$
            – Vogel612
            9 hours ago














          • 1




            $begingroup$
            Surely if head is None would be better than if not head, no? It may or may not be faster, depending on where the bottlenecks are (it's usually faster) and it more clearly conveys the intent of the code.
            $endgroup$
            – wizzwizz4
            10 hours ago










          • $begingroup$
            Thanks for this, yours is really neat. It still looks like it's $O(n)$ to me as you have to iterate through the list?
            $endgroup$
            – Tobi Alafin
            10 hours ago










          • $begingroup$
            I find special cases aesthetically unappealing which is why I used the "hack"; is that considered bad form?
            $endgroup$
            – Tobi Alafin
            10 hours ago






          • 1




            $begingroup$
            it is in fact O(n) in time, but O(1) in space. Special cases are something I prefer to handle "separately". YMMV, but as soon as special cases become harder to reconciliate with the core algorithm, it makes more and more sense to handle them in dedicated branches
            $endgroup$
            – Vogel612
            9 hours ago








          1




          1




          $begingroup$
          Surely if head is None would be better than if not head, no? It may or may not be faster, depending on where the bottlenecks are (it's usually faster) and it more clearly conveys the intent of the code.
          $endgroup$
          – wizzwizz4
          10 hours ago




          $begingroup$
          Surely if head is None would be better than if not head, no? It may or may not be faster, depending on where the bottlenecks are (it's usually faster) and it more clearly conveys the intent of the code.
          $endgroup$
          – wizzwizz4
          10 hours ago












          $begingroup$
          Thanks for this, yours is really neat. It still looks like it's $O(n)$ to me as you have to iterate through the list?
          $endgroup$
          – Tobi Alafin
          10 hours ago




          $begingroup$
          Thanks for this, yours is really neat. It still looks like it's $O(n)$ to me as you have to iterate through the list?
          $endgroup$
          – Tobi Alafin
          10 hours ago












          $begingroup$
          I find special cases aesthetically unappealing which is why I used the "hack"; is that considered bad form?
          $endgroup$
          – Tobi Alafin
          10 hours ago




          $begingroup$
          I find special cases aesthetically unappealing which is why I used the "hack"; is that considered bad form?
          $endgroup$
          – Tobi Alafin
          10 hours ago




          1




          1




          $begingroup$
          it is in fact O(n) in time, but O(1) in space. Special cases are something I prefer to handle "separately". YMMV, but as soon as special cases become harder to reconciliate with the core algorithm, it makes more and more sense to handle them in dedicated branches
          $endgroup$
          – Vogel612
          9 hours ago




          $begingroup$
          it is in fact O(n) in time, but O(1) in space. Special cases are something I prefer to handle "separately". YMMV, but as soon as special cases become harder to reconciliate with the core algorithm, it makes more and more sense to handle them in dedicated branches
          $endgroup$
          – Vogel612
          9 hours ago


















          draft saved

          draft discarded




















































          Thanks for contributing an answer to Code Review Stack Exchange!


          • Please be sure to answer the question. Provide details and share your research!

          But avoid



          • Asking for help, clarification, or responding to other answers.

          • Making statements based on opinion; back them up with references or personal experience.


          Use MathJax to format equations. MathJax reference.


          To learn more, see our tips on writing great answers.




          draft saved


          draft discarded














          StackExchange.ready(
          function () {
          StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fcodereview.stackexchange.com%2fquestions%2f215140%2fcodewars-linked-lists-remove-duplicates%23new-answer', 'question_page');
          }
          );

          Post as a guest















          Required, but never shown





















































          Required, but never shown














          Required, but never shown












          Required, but never shown







          Required, but never shown

































          Required, but never shown














          Required, but never shown












          Required, but never shown







          Required, but never shown







          Popular posts from this blog

          How to label and detect the document text images

          Tabula Rosettana

          Aureus (color)