Why does a belief network need to be represented using a directed acyclic graph (DAG)?












      I would have thought that it's because DAGs preserve the dependency relationships between the variables, but I am currently unsure.



      Thanks







      machine-learning probability graphical-model






      asked 2 days ago









      silverscientist

          2 Answers
                  Yes, we use DAGs to represent dependency relationships.



                  We need a directed graph because the conditional probability P(A|B) is not the same as P(B|A).
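
To make the asymmetry concrete, here is a minimal Python sketch on a made-up joint table over two binary variables. The numbers and the helper names `marginal` and `conditional` are illustrative assumptions, not part of the answer.

```python
from fractions import Fraction as F

# Hypothetical joint distribution P(A, B); keys are (a, b), numbers are illustrative.
joint = {
    (0, 0): F(4, 10), (0, 1): F(1, 10),
    (1, 0): F(2, 10), (1, 1): F(3, 10),
}

def marginal(joint, axis, value):
    """Sum the joint over every entry whose variable at `axis` equals `value`."""
    return sum(p for key, p in joint.items() if key[axis] == value)

def conditional(joint, target_axis, target, given_axis, given):
    """P(target variable = target | given variable = given)."""
    num = sum(p for key, p in joint.items()
              if key[target_axis] == target and key[given_axis] == given)
    return num / marginal(joint, given_axis, given)

p_a_given_b = conditional(joint, 0, 1, 1, 1)  # P(A=1 | B=1)
p_b_given_a = conditional(joint, 1, 1, 0, 1)  # P(B=1 | A=1)
print(p_a_given_b, p_b_given_a)
```

The two conditionals share the same numerator P(A=1, B=1) but are divided by different marginals, which is why an arrow's direction carries information.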



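                  We require it to be acyclic so that the joint distribution factorizes into a product of each node's conditional distribution given its parents, which also eases calculation. If you allow a directed cycle, the local conditional distributions need not multiply to a consistent joint distribution, so you can get contradictions.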
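
One practical consequence of acyclicity is that the nodes can be ordered so that parents always come before their children, which is what lets the joint be evaluated or sampled one node at a time. The sketch below uses Kahn's algorithm to find such an order; the function name `topological_order` and the toy graphs are assumptions chosen for illustration.

```python
from collections import deque

def topological_order(nodes, edges):
    """Return a parents-before-children order, or None if the graph has a directed cycle."""
    indegree = {n: 0 for n in nodes}
    children = {n: [] for n in nodes}
    for parent, child in edges:
        children[parent].append(child)
        indegree[child] += 1
    # Start from the nodes that have no parents.
    queue = deque(n for n in nodes if indegree[n] == 0)
    order = []
    while queue:
        node = queue.popleft()
        order.append(node)
        for child in children[node]:
            indegree[child] -= 1
            if indegree[child] == 0:
                queue.append(child)
    # If some node was never released, it sits on a cycle.
    return order if len(order) == len(nodes) else None

dag = topological_order(["A", "B", "C"], [("A", "B"), ("B", "C"), ("A", "C")])
cyclic = topological_order(["A", "B", "C"], [("A", "B"), ("B", "C"), ("C", "A")])
print(dag)
print(cyclic)
```

For the cyclic graph no node is free of parents, so there is no place to start the product of conditionals.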



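                  If you relax both restrictions (directed and acyclic), you get a different model called a Markov network, which is undirected and may contain cycles. It is not strictly more general: each family can encode some independence structures that the other cannot.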







                  answered 2 days ago









                  Preet

                          A belief network is defined to be a DAG (with a conditional probability distribution attached to each node). A better question would be: "Why would a distribution be represented by a belief network, i.e. a DAG?"



                          Dependency relationships can be modeled with both directed acyclic and undirected (possibly cyclic) representations. Therefore, a distribution may be represented either way.



                          Why acyclic and directed?



                          By the chain rule, every joint distribution can be sequentially factorized into conditional probabilities as follows, and can therefore be represented by a DAG:



                          $P(X_1,...,X_n)=P(X_n|X_1,...,X_{n-1})\color{blue}{P(X_1,...,X_{n-1})}$,
                          $\color{blue}{P(X_1,...,X_{n-1})}=P(X_{n-1}|X_1,...,X_{n-2})P(X_1,...,X_{n-2})$,

                          ...,
                          $P(X_1, X_2, X_3)=P(X_3|X_1, X_2)\color{brown}{P(X_1, X_2)}$,
                          $\color{brown}{P(X_1, X_2)}=P(X_2|X_1)P(X_1)$.



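                          The belief network $G$ is built as follows: $X_n$ has no outgoing link, $X_{n-1}$ links to $X_n$, $X_{n-2}$ links to $X_{n-1}$ and $X_n$, and finally $X_1$ links to all of $X_2$ through $X_n$. Every link points from a lower index to a higher one, so $G$ cannot contain a directed cycle.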
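
The construction of $G$ can be sketched in a few lines of Python. The helper names `chain_rule_dag` and `parents` and the four-variable ordering are hypothetical, chosen only to illustrate the rule "link $X_i$ to $X_j$ whenever $i < j$".

```python
def chain_rule_dag(variables):
    """Edges of the fully connected DAG for the given ordering: X_i -> X_j whenever i < j."""
    return [(variables[i], variables[j])
            for i in range(len(variables))
            for j in range(i + 1, len(variables))]

def parents(edges, node):
    """All nodes with a link into `node`, i.e. the conditioning set of its factor."""
    return [p for p, c in edges if c == node]

order = ["X1", "X2", "X3", "X4"]
edges = chain_rule_dag(order)
for x in order:
    print(x, "<-", parents(edges, x))
```

Each node's parent list matches the conditioning set in the corresponding chain-rule factor, for example $X_4$ is conditioned on $X_1, X_2, X_3$.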



                          Generally, there is no unique order for the factorization, so multiple networks can represent the same distribution. For example, $P(A, B)$ can be factorized as $P(A|B)P(B)$ or $P(B|A)P(A)$, which are represented by two different Bayesian networks ($B \to A$ and $A \to B$).
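
As a quick numerical check, the sketch below factorizes one made-up joint table in both orders and confirms that each product recovers the same joint. The table values and variable names are illustrative assumptions.

```python
from fractions import Fraction as F

# Hypothetical joint P(A, B); keys are (a, b).
joint = {(0, 0): F(4, 10), (0, 1): F(1, 10), (1, 0): F(2, 10), (1, 1): F(3, 10)}

# Marginals of each variable.
p_a = {a: sum(p for (x, _), p in joint.items() if x == a) for a in (0, 1)}
p_b = {b: sum(p for (_, y), p in joint.items() if y == b) for b in (0, 1)}

# Conditional tables for the two possible arrow directions.
p_a_given_b = {(a, b): joint[(a, b)] / p_b[b] for (a, b) in joint}  # network B -> A
p_b_given_a = {(a, b): joint[(a, b)] / p_a[a] for (a, b) in joint}  # network A -> B

# Rebuild the joint from each factorization.
via_b_to_a = {(a, b): p_a_given_b[(a, b)] * p_b[b] for (a, b) in joint}
via_a_to_b = {(a, b): p_b_given_a[(a, b)] * p_a[a] for (a, b) in joint}
print(via_b_to_a == joint, via_a_to_b == joint)
```

The two networks store different conditional tables, yet both encode the same joint distribution.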



                          Every Bayesian network over these variables is a sub-graph of $G$ for some ordering of the variables. Some directed links of $G$ are removed to introduce conditional independence assumptions between variables.
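
To illustrate what removing a link buys, the sketch below takes the three-variable case and drops the link $X_1 \to X_3$, leaving the chain $X_1 \to X_2 \to X_3$. The conditional probability tables are made-up numbers; the check confirms that $X_3$ is then independent of $X_1$ given $X_2$.

```python
from fractions import Fraction as F
from itertools import product

# Hypothetical CPTs for the chain X1 -> X2 -> X3 (the link X1 -> X3 is removed).
p_x1 = {0: F(3, 5), 1: F(2, 5)}
p_x2_given_x1 = {0: {0: F(7, 10), 1: F(3, 10)}, 1: {0: F(1, 5), 1: F(4, 5)}}  # [x1][x2]
p_x3_given_x2 = {0: {0: F(9, 10), 1: F(1, 10)}, 1: {0: F(1, 2), 1: F(1, 2)}}  # [x2][x3]

# Joint distribution as the product of each node's conditional given its parents.
joint = {(a, b, c): p_x1[a] * p_x2_given_x1[a][b] * p_x3_given_x2[b][c]
         for a, b, c in product((0, 1), repeat=3)}

def p_x3_given(x1, x2, x3):
    """P(X3 = x3 | X1 = x1, X2 = x2), computed from the joint table."""
    return joint[(x1, x2, x3)] / sum(joint[(x1, x2, c)] for c in (0, 1))

total = sum(joint.values())
# Changing x1 must not change the conditional of X3 once x2 is fixed.
independent = all(p_x3_given(0, b, c) == p_x3_given(1, b, c)
                  for b, c in product((0, 1), repeat=2))
print(total, independent)
```

With the link removed, the factor for $X_3$ needs a table indexed by $X_2$ only, instead of by both $X_1$ and $X_2$.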



                          How about cyclic and undirected?



                          There are also Markov networks. Some distributions have independence structures that can be captured exactly by a Markov network but by no Bayesian network [Example 3.6, page 82, Probabilistic Graphical Models: Principles and Techniques, Daphne Koller and Nir Friedman]. These networks use undirected edges to represent dependencies between random variables. An edge between A and B indicates that the two variables appear together in [part of] a factor (a function).



                          For example, $\phi_1(A, B)=A \cdot B/2=B \cdot A/2$ is a factor of the factorization $P_{\theta}(A, B, C, D) = \frac{1}{Z(\theta)}\phi_1(A, B)\phi_2(B, C, D)$. Note that the link $(A, B)$ cannot be unambiguously assigned a direction, since $\phi_1(A, B)$ is symmetric in its arguments and both directions are equally justified. As an example of cyclic relations, the factor $\phi_2(B, C, D) = B \cdot (C+D)$ is represented by a triangle among $B$, $C$, and $D$.
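
The factorization can be evaluated directly. The sketch below assumes, purely for illustration, that each variable takes values in {0, 1}; the answer does not specify the domain. It uses the two factors from the example, with the three-variable factor named `phi2`, and computes the partition function $Z$ by summing over all states.

```python
from fractions import Fraction as F
from itertools import product

def phi1(a, b):
    """Symmetric pairwise factor A*B/2 on the edge (A, B)."""
    return F(a * b, 2)

def phi2(b, c, d):
    """Factor B*(C+D) over the triangle B, C, D."""
    return F(b * (c + d))

# Assumption: every variable is binary with values in {0, 1}.
states = list(product((0, 1), repeat=4))  # each state is (a, b, c, d)
unnormalized = {s: phi1(s[0], s[1]) * phi2(s[1], s[2], s[3]) for s in states}
Z = sum(unnormalized.values())            # partition function
P = {s: w / Z for s, w in unnormalized.items()}

print(Z)
print({s: p for s, p in P.items() if p > 0})
```

Unlike the conditional tables of a Bayesian network, the factors are not probabilities on their own; only the product divided by $Z$ is normalized.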



                          Generally speaking, cycles tend to complicate learning and inference on Markov networks, so fewer cycles are preferable.







                          answered 2 days ago









                          Esmailian
