Betweenness centrality formula

















Betweenness centrality is defined as the number of shortest paths that go through a node in the graph. The formula is:

$$\sum_{s \neq v \neq t} \frac{\sigma_{st}(v)}{\sigma_{st}}$$

where $\sigma_{st}$ is the total number of shortest paths from node $s$ to node $t$, and $\sigma_{st}(v)$ is the number of those paths that pass through $v$.

However, it doesn't seem to me that the formula calculates what is defined. Why do we divide by the total number of shortest paths between $s$ and $t$ each time? Shouldn't we just divide by $2$ to compensate for the fact that $s$ and $t$ will appear twice in different orders?







Tags: graph-theory

asked 3 hours ago by Eloo






















2 Answers










"However it doesn't seem to me that the formula calculates what is defined."

The formula is right. The betweenness centrality is a value in the interval $[0, 1]$. Thus, if the betweenness centrality of node $v$ is equal to $1$, then all shortest paths between two nodes of this graph pass through $v$. I will explain the correctness of this summation below.

"Why do we divide by the total number of shortest paths between s and t each time?"

You are summing percentages. The division is needed to ensure that the sum never exceeds $1$. Suppose that you have $m$ different $s$-$t$ pairs of vertices in your graph. Thus, $\sigma_{st} = m$ and your summation goes through all $m$ $s$-$t$ pairs.

Note that the term $\sigma_{st}(v)$ in this equation is binary (the shortest $s$-$t$ path either passes through $v$ or it does not). Thus, if all $s$-$t$ paths go through $v$, you will have $m \cdot \frac{1}{m} = 1$.

"Shouldn't we just divide by 2 to compensate the fact that s and t will appear twice in different orders?"

Indirectly, you're right. This formula measures the percentage of the shortest $s$-$t$ paths that pass through node $v$. In fact, a simple optimization of this algorithm for undirected graphs is to consider only $s$-$t$ pairs where $s < t$. However, you can't replace the division by $\sigma_{st}$ with a division by $2$.
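The remark about undirected graphs can be written out as a short identity. This is a sketch under two assumptions that the post does not state explicitly: the sum in the formula ranges over ordered pairs $(s, t)$, and $s < t$ refers to some fixed ordering of the vertices. In an undirected graph, reversing a shortest $s$-$t$ path gives a shortest $t$-$s$ path, so $\sigma_{st} = \sigma_{ts}$ and $\sigma_{st}(v) = \sigma_{ts}(v)$, and therefore

$$\sum_{s \neq v \neq t} \frac{\sigma_{st}(v)}{\sigma_{st}} = 2 \sum_{\substack{s < t \\ s \neq v \neq t}} \frac{\sigma_{st}(v)}{\sigma_{st}}$$

The factor $2$ only relates the two ways of enumerating pairs. The denominator $\sigma_{st}$ still appears inside every term.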





Curiosity: the only graph topology that has a node with betweenness centrality equal to $1$ is a star graph, like the examples shown in the figure below.

[Figure: examples of star graphs]







answered 1 hour ago by Iago Carvalho (edited 1 hour ago)












• It looks like you confuse betweenness centrality of a node in a graph with the betweenness of a node between two nodes. The former might be greater than 1 before normalization. – Apass.Jack, 20 mins ago
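The comment's point can be checked on a small case. As an illustration, assume a star graph with $n$ vertices (a center $c$ and $n - 1$ leaves) and a sum over ordered pairs $(s, t)$. Every ordered pair of distinct leaves has exactly one shortest path, and it passes through $c$, so

$$\sum_{s \neq c \neq t} \frac{\sigma_{st}(c)}{\sigma_{st}} = (n-1)(n-2) \cdot \frac{1}{1} = (n-1)(n-2)$$

For $n = 5$ this gives $12$, well above $1$. Dividing by $(n-1)(n-2)$, a commonly used normalization when pairs are ordered, brings the center's value to exactly $1$.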























Suppose we want to quantify the extent to which $v$ is between $s$ and $t$. There are a few ways to do so.

One way to describe that extent is the probability of passing through $v$ when travelling from $s$ to $t$ along a randomly selected shortest path. Assuming each shortest path is selected with equal probability, we get $\frac{\sigma_{st}(v)}{\sigma_{st}}$, where $\sigma_{st}$ is the total number of shortest paths from node $s$ to node $t$ and $\sigma_{st}(v)$ is the number of those paths that pass through $v$.

Assigning the same weight to each pair of starting vertex and destination vertex, we can see that $\sum_{s \neq v \neq t} \frac{\sigma_{st}(v)}{\sigma_{st}}$ measures the extent to which $v$ is the center of betweenness.
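A minimal Python sketch of this sum may make it more concrete. It assumes an unweighted graph stored as an adjacency dictionary and a sum over ordered pairs $(s, t)$. The names `bfs_counts` and `betweenness` are illustrative and do not come from any library. It relies on the fact that $v$ lies on a shortest $s$-$t$ path exactly when $d(s, v) + d(v, t) = d(s, t)$, in which case $\sigma_{st}(v) = \sigma_{sv} \cdot \sigma_{vt}$.

```python
from collections import deque


def bfs_counts(adj, source):
    """Return (dist, sigma): BFS distances from source and the number of
    shortest paths from source to every reachable node."""
    dist = {source: 0}
    sigma = {source: 1}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:               # w is reached for the first time
                dist[w] = dist[u] + 1
                sigma[w] = 0
                queue.append(w)
            if dist[w] == dist[u] + 1:      # u precedes w on a shortest path
                sigma[w] += sigma[u]
    return dist, sigma


def betweenness(adj, v):
    """Unnormalized sum of sigma_st(v) / sigma_st over ordered pairs (s, t)
    with s != v, t != v and s != t."""
    info = {s: bfs_counts(adj, s) for s in adj}
    dist_v, sigma_v = info[v]
    total = 0.0
    for s in adj:
        if s == v:
            continue
        dist_s, sigma_s = info[s]
        if v not in dist_s:                 # v is unreachable from s
            continue
        for t in adj:
            if t == s or t == v or t not in dist_s or t not in dist_v:
                continue
            # v is on a shortest s-t path iff the distances add up exactly
            if dist_s[v] + dist_v[t] == dist_s[t]:
                total += sigma_s[v] * sigma_v[t] / sigma_s[t]
    return total


# A 4-cycle 0-1-2-3-0: there are two shortest paths between 0 and 2.
cycle = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [2, 0]}
print(betweenness(cycle, 1))  # 1.0
```

In the 4-cycle, node 1 lies on only one of the two shortest paths between 0 and 2, so each of the ordered pairs (0, 2) and (2, 0) contributes 1/2 rather than 1.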



[Figure: example graph, created with https://graphonline.ru/]

If you use $\frac{\sigma_{st}(v)}{2}$ to quantify the extent to which $v$ is between $s$ and $t$, there is no problem as long as you only care about $v$ and consider $s$ and $t$ fixed. However, take a look at the graph above.

• How much is $v_3$ between $v_0$ and $v_4$? There are 3 shortest paths from $v_0$ to $v_4$, 2 of which pass through $v_3$. We get $\frac{\sigma_{v_0 v_4}(v_3)}{2} = 2/2 = 1$.

• How much is $v_5$ between $v_0$ and $v_6$? There is only 1 shortest path from $v_0$ to $v_6$, and it passes through $v_5$. We get $\frac{\sigma_{v_0 v_6}(v_5)}{2} = 1/2 = 0.5$.

Since $1 > 0.5$, we would conclude that $v_3$ is more between $v_0$ and $v_4$ than $v_5$ is between $v_0$ and $v_6$. However, we can reach $v_4$ from $v_0$ by a shortest path without passing through $v_3$, while we must pass through $v_5$ to reach $v_6$ by a shortest path. So $v_3$ should be less between $v_0$ and $v_4$ than $v_5$ is between $v_0$ and $v_6$. This simple example shows that dividing by 2 is not the right way to normalize the measurement.
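For comparison, plugging the same path counts into the per-pair term of the actual formula gives the ordering one would expect:

$$\frac{\sigma_{v_0 v_4}(v_3)}{\sigma_{v_0 v_4}} = \frac{2}{3} \approx 0.67 < 1 = \frac{1}{1} = \frac{\sigma_{v_0 v_6}(v_5)}{\sigma_{v_0 v_6}}$$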









answered 26 mins ago by Apass.Jack





























