Why Do My Nonclustered Indexes Use More Space When I Delete Rows?












7















I have a large table with 7.5 billion rows and 5 indexes.
When I delete roughly 10 million rows, I notice that the nonclustered indexes seem to increase the number of pages they're stored on.



I wrote a query against dm_db_partition_stats to report the difference (after - before) in pages:



dm_db_partition_stats deltas



Index 1 is the clustered index, Index 2 is the primary key. The others are nonclustered and non-unique.



Why are the pages increasing on those non-clustered indexes?

I expected the numbers to at worst stay the same.

I do see performance counters report an increase in page-splits during the delete.



When deleting, does the ghost record have to move to another page?
Does this have to do with "uniqueifiers"?



We are in the middle of rolling out RCSI, but right now, RCSI is off.



It is a primary node in an availability group. I know that snapshot is used somehow on secondaries. I'd be surprised if that was relevant. I plan to dig into this (looking dbcc page output) to learn more. Here's hoping someone has seen something similar.










share|improve this question

























  • Just a question - running a REORGANIZE on one of the indexes that grew, what happens? How many pages are removed? And if you Reorganize before deleting, what happens? I'm mostly thinking that the internal mechanisms might find it easier in some cases to allocate an entire new page and merge, but doesn't clean up the empty pages. I know that REORGANIZE ends up dropping significant amounts of pages, even on relatively unfragmented but larger indexes.

    – Laughing Vergil
    1 hour ago











  • Good question @LaughingVergil When I have the answer, I'll come back here to report it. (But it may take a while).

    – Michael J Swart
    1 hour ago
















7















I have a large table with 7.5 billion rows and 5 indexes.
When I delete roughly 10 million rows, I notice that the nonclustered indexes seem to increase the number of pages they're stored on.



I wrote a query against dm_db_partition_stats to report the difference (after - before) in pages:



dm_db_partition_stats deltas



Index 1 is the clustered index, Index 2 is the primary key. The others are nonclustered and non-unique.



Why are the pages increasing on those non-clustered indexes?

I expected the numbers to at worst stay the same.

I do see performance counters report an increase in page-splits during the delete.



When deleting, does the ghost record have to move to another page?
Does this have to do with "uniqueifiers"?



We are in the middle of rolling out RCSI, but right now, RCSI is off.



It is a primary node in an availability group. I know that snapshot is used somehow on secondaries. I'd be surprised if that was relevant. I plan to dig into this (looking dbcc page output) to learn more. Here's hoping someone has seen something similar.










share|improve this question

























  • Just a question - running a REORGANIZE on one of the indexes that grew, what happens? How many pages are removed? And if you Reorganize before deleting, what happens? I'm mostly thinking that the internal mechanisms might find it easier in some cases to allocate an entire new page and merge, but doesn't clean up the empty pages. I know that REORGANIZE ends up dropping significant amounts of pages, even on relatively unfragmented but larger indexes.

    – Laughing Vergil
    1 hour ago











  • Good question @LaughingVergil When I have the answer, I'll come back here to report it. (But it may take a while).

    – Michael J Swart
    1 hour ago














7












7








7


1






I have a large table with 7.5 billion rows and 5 indexes.
When I delete roughly 10 million rows, I notice that the nonclustered indexes seem to increase the number of pages they're stored on.



I wrote a query against dm_db_partition_stats to report the difference (after - before) in pages:



dm_db_partition_stats deltas



Index 1 is the clustered index, Index 2 is the primary key. The others are nonclustered and non-unique.



Why are the pages increasing on those non-clustered indexes?

I expected the numbers to at worst stay the same.

I do see performance counters report an increase in page-splits during the delete.



When deleting, does the ghost record have to move to another page?
Does this have to do with "uniqueifiers"?



We are in the middle of rolling out RCSI, but right now, RCSI is off.



It is a primary node in an availability group. I know that snapshot is used somehow on secondaries. I'd be surprised if that was relevant. I plan to dig into this (looking dbcc page output) to learn more. Here's hoping someone has seen something similar.










share|improve this question
















I have a large table with 7.5 billion rows and 5 indexes.
When I delete roughly 10 million rows, I notice that the nonclustered indexes seem to increase the number of pages they're stored on.



I wrote a query against dm_db_partition_stats to report the difference (after - before) in pages:



dm_db_partition_stats deltas



Index 1 is the clustered index, Index 2 is the primary key. The others are nonclustered and non-unique.



Why are the pages increasing on those non-clustered indexes?

I expected the numbers to at worst stay the same.

I do see performance counters report an increase in page-splits during the delete.



When deleting, does the ghost record have to move to another page?
Does this have to do with "uniqueifiers"?



We are in the middle of rolling out RCSI, but right now, RCSI is off.



It is a primary node in an availability group. I know that snapshot is used somehow on secondaries. I'd be surprised if that was relevant. I plan to dig into this (looking dbcc page output) to learn more. Here's hoping someone has seen something similar.







sql-server sql-server-2014 nonclustered-index






share|improve this question















share|improve this question













share|improve this question




share|improve this question








edited 1 hour ago









Paul White

52k14278450




52k14278450










asked 2 hours ago









Michael J SwartMichael J Swart

1,05851225




1,05851225













  • Just a question - running a REORGANIZE on one of the indexes that grew, what happens? How many pages are removed? And if you Reorganize before deleting, what happens? I'm mostly thinking that the internal mechanisms might find it easier in some cases to allocate an entire new page and merge, but doesn't clean up the empty pages. I know that REORGANIZE ends up dropping significant amounts of pages, even on relatively unfragmented but larger indexes.

    – Laughing Vergil
    1 hour ago











  • Good question @LaughingVergil When I have the answer, I'll come back here to report it. (But it may take a while).

    – Michael J Swart
    1 hour ago



















  • Just a question - running a REORGANIZE on one of the indexes that grew, what happens? How many pages are removed? And if you Reorganize before deleting, what happens? I'm mostly thinking that the internal mechanisms might find it easier in some cases to allocate an entire new page and merge, but doesn't clean up the empty pages. I know that REORGANIZE ends up dropping significant amounts of pages, even on relatively unfragmented but larger indexes.

    – Laughing Vergil
    1 hour ago











  • Good question @LaughingVergil When I have the answer, I'll come back here to report it. (But it may take a while).

    – Michael J Swart
    1 hour ago

















Just a question - running a REORGANIZE on one of the indexes that grew, what happens? How many pages are removed? And if you Reorganize before deleting, what happens? I'm mostly thinking that the internal mechanisms might find it easier in some cases to allocate an entire new page and merge, but doesn't clean up the empty pages. I know that REORGANIZE ends up dropping significant amounts of pages, even on relatively unfragmented but larger indexes.

– Laughing Vergil
1 hour ago





Just a question - running a REORGANIZE on one of the indexes that grew, what happens? How many pages are removed? And if you Reorganize before deleting, what happens? I'm mostly thinking that the internal mechanisms might find it easier in some cases to allocate an entire new page and merge, but doesn't clean up the empty pages. I know that REORGANIZE ends up dropping significant amounts of pages, even on relatively unfragmented but larger indexes.

– Laughing Vergil
1 hour ago













Good question @LaughingVergil When I have the answer, I'll come back here to report it. (But it may take a while).

– Michael J Swart
1 hour ago





Good question @LaughingVergil When I have the answer, I'll come back here to report it. (But it may take a while).

– Michael J Swart
1 hour ago










1 Answer
1






active

oldest

votes


















4














One possible scenario that very much amuses me:




  • The rows were originally written when the database didn't have Read Committed Snapshot (RCSI), Snapshot Isolation (SI), or Availability Groups (AGs) enabled

  • RCSI or SI was enabled, or the database was added into an Availability Group

  • During the deletions, a 14-byte timestamp was added to the deleted rows to support RCSI/SI/AG reads


Since this server is a primary in an AG, it's affected just like the secondaries are. The version info is added on the primary - the data pages are the exact same on both the primaries and secondaries. The secondaries leverage the version store to do their reads while the rows are being updated by the AG, but the secondaries don't write their own versions of the timestamp to the page. They just inherit the versions from the primary's work.



To demonstrate the growth, I took the Stack Overflow database export (which doesn't have RCSI enabled) and created a bunch of indexes on the Posts table. I checked index sizes with sp_BlitzIndex @Mode = 2 (copy/pasted into a spreadsheet, and cleaned up a little to maximize info density):



sp_BlitzIndex before



I then deleted about half of the rows:



BEGIN TRAN;
DELETE dbo.Posts WHERE Id % 2 = 0;
GO


Amusingly, while the deletes were happening, the data file was growing to accommodate the timestamps, too! The SSMS Disk Usage Report shows the growth events - here's just the top to illustrate:



Growth events



(Gotta love a demo where deletes make the database grow.) While the delete was running, I ran sp_BlitzIndex again. Note that the clustered index has less rows, but its size has already grown by about 1.5GB. The nonclustered indexes on AcceptedAnswerId have grown dramatically - they're indexes on a small value that's mostly null, so their index sizes have nearly doubled!



sp_BlitzIndex during deletion



I don't have to wait for the deletion to finish to prove that out, so I'll stop the demo there. Point being: when you do big deletions on a table that was implemented before RCSI, SI, or AGs were enabled, the indexes (including the clustered) can actually grow to accommodate the addition of the version store timestamp.






share|improve this answer

























    Your Answer








    StackExchange.ready(function() {
    var channelOptions = {
    tags: "".split(" "),
    id: "182"
    };
    initTagRenderer("".split(" "), "".split(" "), channelOptions);

    StackExchange.using("externalEditor", function() {
    // Have to fire editor after snippets, if snippets enabled
    if (StackExchange.settings.snippets.snippetsEnabled) {
    StackExchange.using("snippets", function() {
    createEditor();
    });
    }
    else {
    createEditor();
    }
    });

    function createEditor() {
    StackExchange.prepareEditor({
    heartbeatType: 'answer',
    autoActivateHeartbeat: false,
    convertImagesToLinks: false,
    noModals: true,
    showLowRepImageUploadWarning: true,
    reputationToPostImages: null,
    bindNavPrevention: true,
    postfix: "",
    imageUploader: {
    brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
    contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
    allowUrls: true
    },
    onDemand: true,
    discardSelector: ".discard-answer"
    ,immediatelyShowMarkdownHelp:true
    });


    }
    });














    draft saved

    draft discarded


















    StackExchange.ready(
    function () {
    StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fdba.stackexchange.com%2fquestions%2f229796%2fwhy-do-my-nonclustered-indexes-use-more-space-when-i-delete-rows%23new-answer', 'question_page');
    }
    );

    Post as a guest















    Required, but never shown

























    1 Answer
    1






    active

    oldest

    votes








    1 Answer
    1






    active

    oldest

    votes









    active

    oldest

    votes






    active

    oldest

    votes









    4














    One possible scenario that very much amuses me:




    • The rows were originally written when the database didn't have Read Committed Snapshot (RCSI), Snapshot Isolation (SI), or Availability Groups (AGs) enabled

    • RCSI or SI was enabled, or the database was added into an Availability Group

    • During the deletions, a 14-byte timestamp was added to the deleted rows to support RCSI/SI/AG reads


    Since this server is a primary in an AG, it's affected just like the secondaries are. The version info is added on the primary - the data pages are the exact same on both the primaries and secondaries. The secondaries leverage the version store to do their reads while the rows are being updated by the AG, but the secondaries don't write their own versions of the timestamp to the page. They just inherit the versions from the primary's work.



    To demonstrate the growth, I took the Stack Overflow database export (which doesn't have RCSI enabled) and created a bunch of indexes on the Posts table. I checked index sizes with sp_BlitzIndex @Mode = 2 (copy/pasted into a spreadsheet, and cleaned up a little to maximize info density):



    sp_BlitzIndex before



    I then deleted about half of the rows:



    BEGIN TRAN;
    DELETE dbo.Posts WHERE Id % 2 = 0;
    GO


    Amusingly, while the deletes were happening, the data file was growing to accommodate the timestamps, too! The SSMS Disk Usage Report shows the growth events - here's just the top to illustrate:



    Growth events



    (Gotta love a demo where deletes make the database grow.) While the delete was running, I ran sp_BlitzIndex again. Note that the clustered index has less rows, but its size has already grown by about 1.5GB. The nonclustered indexes on AcceptedAnswerId have grown dramatically - they're indexes on a small value that's mostly null, so their index sizes have nearly doubled!



    sp_BlitzIndex during deletion



    I don't have to wait for the deletion to finish to prove that out, so I'll stop the demo there. Point being: when you do big deletions on a table that was implemented before RCSI, SI, or AGs were enabled, the indexes (including the clustered) can actually grow to accommodate the addition of the version store timestamp.






    share|improve this answer






























      4














      One possible scenario that very much amuses me:




      • The rows were originally written when the database didn't have Read Committed Snapshot (RCSI), Snapshot Isolation (SI), or Availability Groups (AGs) enabled

      • RCSI or SI was enabled, or the database was added into an Availability Group

      • During the deletions, a 14-byte timestamp was added to the deleted rows to support RCSI/SI/AG reads


      Since this server is a primary in an AG, it's affected just like the secondaries are. The version info is added on the primary - the data pages are the exact same on both the primaries and secondaries. The secondaries leverage the version store to do their reads while the rows are being updated by the AG, but the secondaries don't write their own versions of the timestamp to the page. They just inherit the versions from the primary's work.



      To demonstrate the growth, I took the Stack Overflow database export (which doesn't have RCSI enabled) and created a bunch of indexes on the Posts table. I checked index sizes with sp_BlitzIndex @Mode = 2 (copy/pasted into a spreadsheet, and cleaned up a little to maximize info density):



      sp_BlitzIndex before



      I then deleted about half of the rows:



      BEGIN TRAN;
      DELETE dbo.Posts WHERE Id % 2 = 0;
      GO


      Amusingly, while the deletes were happening, the data file was growing to accommodate the timestamps, too! The SSMS Disk Usage Report shows the growth events - here's just the top to illustrate:



      Growth events



      (Gotta love a demo where deletes make the database grow.) While the delete was running, I ran sp_BlitzIndex again. Note that the clustered index has less rows, but its size has already grown by about 1.5GB. The nonclustered indexes on AcceptedAnswerId have grown dramatically - they're indexes on a small value that's mostly null, so their index sizes have nearly doubled!



      sp_BlitzIndex during deletion



      I don't have to wait for the deletion to finish to prove that out, so I'll stop the demo there. Point being: when you do big deletions on a table that was implemented before RCSI, SI, or AGs were enabled, the indexes (including the clustered) can actually grow to accommodate the addition of the version store timestamp.






      share|improve this answer




























        4












        4








        4







        One possible scenario that very much amuses me:




        • The rows were originally written when the database didn't have Read Committed Snapshot (RCSI), Snapshot Isolation (SI), or Availability Groups (AGs) enabled

        • RCSI or SI was enabled, or the database was added into an Availability Group

        • During the deletions, a 14-byte timestamp was added to the deleted rows to support RCSI/SI/AG reads


        Since this server is a primary in an AG, it's affected just like the secondaries are. The version info is added on the primary - the data pages are the exact same on both the primaries and secondaries. The secondaries leverage the version store to do their reads while the rows are being updated by the AG, but the secondaries don't write their own versions of the timestamp to the page. They just inherit the versions from the primary's work.



        To demonstrate the growth, I took the Stack Overflow database export (which doesn't have RCSI enabled) and created a bunch of indexes on the Posts table. I checked index sizes with sp_BlitzIndex @Mode = 2 (copy/pasted into a spreadsheet, and cleaned up a little to maximize info density):



        sp_BlitzIndex before



        I then deleted about half of the rows:



        BEGIN TRAN;
        DELETE dbo.Posts WHERE Id % 2 = 0;
        GO


        Amusingly, while the deletes were happening, the data file was growing to accommodate the timestamps, too! The SSMS Disk Usage Report shows the growth events - here's just the top to illustrate:



        Growth events



        (Gotta love a demo where deletes make the database grow.) While the delete was running, I ran sp_BlitzIndex again. Note that the clustered index has less rows, but its size has already grown by about 1.5GB. The nonclustered indexes on AcceptedAnswerId have grown dramatically - they're indexes on a small value that's mostly null, so their index sizes have nearly doubled!



        sp_BlitzIndex during deletion



        I don't have to wait for the deletion to finish to prove that out, so I'll stop the demo there. Point being: when you do big deletions on a table that was implemented before RCSI, SI, or AGs were enabled, the indexes (including the clustered) can actually grow to accommodate the addition of the version store timestamp.






        share|improve this answer















        One possible scenario that very much amuses me:




        • The rows were originally written when the database didn't have Read Committed Snapshot (RCSI), Snapshot Isolation (SI), or Availability Groups (AGs) enabled

        • RCSI or SI was enabled, or the database was added into an Availability Group

        • During the deletions, a 14-byte timestamp was added to the deleted rows to support RCSI/SI/AG reads


        Since this server is a primary in an AG, it's affected just like the secondaries are. The version info is added on the primary - the data pages are the exact same on both the primaries and secondaries. The secondaries leverage the version store to do their reads while the rows are being updated by the AG, but the secondaries don't write their own versions of the timestamp to the page. They just inherit the versions from the primary's work.



        To demonstrate the growth, I took the Stack Overflow database export (which doesn't have RCSI enabled) and created a bunch of indexes on the Posts table. I checked index sizes with sp_BlitzIndex @Mode = 2 (copy/pasted into a spreadsheet, and cleaned up a little to maximize info density):



        sp_BlitzIndex before



        I then deleted about half of the rows:



        BEGIN TRAN;
        DELETE dbo.Posts WHERE Id % 2 = 0;
        GO


        Amusingly, while the deletes were happening, the data file was growing to accommodate the timestamps, too! The SSMS Disk Usage Report shows the growth events - here's just the top to illustrate:



        Growth events



        (Gotta love a demo where deletes make the database grow.) While the delete was running, I ran sp_BlitzIndex again. Note that the clustered index has less rows, but its size has already grown by about 1.5GB. The nonclustered indexes on AcceptedAnswerId have grown dramatically - they're indexes on a small value that's mostly null, so their index sizes have nearly doubled!



        sp_BlitzIndex during deletion



        I don't have to wait for the deletion to finish to prove that out, so I'll stop the demo there. Point being: when you do big deletions on a table that was implemented before RCSI, SI, or AGs were enabled, the indexes (including the clustered) can actually grow to accommodate the addition of the version store timestamp.







        share|improve this answer














        share|improve this answer



        share|improve this answer








        edited 57 mins ago

























        answered 1 hour ago









        Brent OzarBrent Ozar

        34.7k19103234




        34.7k19103234






























            draft saved

            draft discarded




















































            Thanks for contributing an answer to Database Administrators Stack Exchange!


            • Please be sure to answer the question. Provide details and share your research!

            But avoid



            • Asking for help, clarification, or responding to other answers.

            • Making statements based on opinion; back them up with references or personal experience.


            To learn more, see our tips on writing great answers.




            draft saved


            draft discarded














            StackExchange.ready(
            function () {
            StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fdba.stackexchange.com%2fquestions%2f229796%2fwhy-do-my-nonclustered-indexes-use-more-space-when-i-delete-rows%23new-answer', 'question_page');
            }
            );

            Post as a guest















            Required, but never shown





















































            Required, but never shown














            Required, but never shown












            Required, but never shown







            Required, but never shown

































            Required, but never shown














            Required, but never shown












            Required, but never shown







            Required, but never shown







            Popular posts from this blog

            Callistus I

            Tabula Rosettana

            How to label and detect the document text images