Page 91 - Artificial Intelligence for the Internet of Everything

Active Inference in Multiagent Systems  77


              Section 4.3.2, the environment is fully observable, hence we omit the obser-
              vation notation from the rest of the exposition. We define the adaptive
              behavior for a team of agents based on the free-energy principle using three
              processes:
              •  Perception will find the beliefs b representing the probability distribution
                 q(d | m);
              •  Control will produce the next decision d by sampling the state space to
                 minimize surprise; and
              •  Reorganization will adapt the structure m among agents in terms of deci-
                 sion decompositions and the agent network.
              Formally, we define a generative probability distribution for a decision var-
              iable d as:

                 \[ p(d \mid m) \approx \frac{1}{Z} e^{C(d)} = \frac{1}{Z} \prod_{j=1}^{P} \varphi_j(d_j), \]

              where \varphi_j(d_j) = e^{c_j(d_j)}. We can then write the variational free energy as a
              function of beliefs b and of the team structure m:

                 \[ F(b, m) = -E_q[\ln p(d \mid m)] - H[q(d \mid b)]. \]

                 Then minimizing the variational free energy F(b, m) with respect to
              probability functions q(d | b) becomes an exact procedure for bounding
              surprise and recovering p(d | m). Exact minimization, however, is intractable for
              general forms of q(d | m) due to the curse of dimensionality.
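
                 As a minimal sketch of the definitions above, the following Python fragment builds
              the generative distribution from local costs and evaluates the free energy for two
              candidate beliefs. The cost values, the two binary decision variables, and the
              function names are illustrative assumptions, not taken from the chapter. Joint
              decisions are enumerated exhaustively, which is only feasible for very small
              problems and is exactly the cost that grows with dimensionality.

```python
import itertools
import math

# Hypothetical local costs c_j(d_j) for P = 2 binary decision variables.
local_costs = [
    {0: 0.0, 1: 1.0},   # c_1(d_1), assumed values
    {0: 0.5, 1: -0.5},  # c_2(d_2), assumed values
]


def generative_distribution(costs):
    """Return p(d | m) = (1/Z) * prod_j exp(c_j(d_j)) over all joint decisions."""
    weights = {}
    for d in itertools.product(*(sorted(c) for c in costs)):
        weights[d] = math.exp(sum(c[d_j] for c, d_j in zip(costs, d)))
    z = sum(weights.values())  # partition function Z
    return {d: w / z for d, w in weights.items()}


def free_energy(q, p):
    """F = -E_q[ln p] - H[q]; terms with q(d) = 0 contribute nothing."""
    expected_log_p = sum(q_d * math.log(p[d]) for d, q_d in q.items() if q_d > 0)
    entropy = -sum(q_d * math.log(q_d) for q_d in q.values() if q_d > 0)
    return -expected_log_p - entropy


p = generative_distribution(local_costs)
uniform = {d: 1.0 / len(p) for d in p}

f_at_p = free_energy(p, p)              # belief equal to the generative distribution
f_at_uniform = free_energy(uniform, p)  # a deliberately mismatched belief
print(f_at_p, f_at_uniform)
```

              Because p is normalized here, F equals the Kullback-Leibler divergence from q to p,
              so it is zero (up to rounding) when q matches p and positive for the uniform belief.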
                 When the generative probability is factorizable, as we described in
              Section 4.2.4, generalized belief propagation is used to find the marginal
              probability distributions (Yedidia et al., 2005) that form the basis for gener-
              ating decision points stochastically. This method, however, requires a diffi-
              cult decision decomposition step and incurs high computational cost in an
              optimal message aggregation step.
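
                 To make the role of the marginals concrete, the sketch below computes them for a
              small factorized distribution and draws one decision from them. It uses brute-force
              enumeration rather than generalized belief propagation, so it illustrates only what
              the marginals are, not how the cited method obtains them. The three binary decisions,
              the two pairwise factors, their cost values, and the choice to sample each variable
              independently from its marginal are all assumptions made for illustration.

```python
import itertools
import math
import random

# Hypothetical factor graph: three binary decisions, two pairwise factors.
# Each factor is (scope, cost table); the factor value is exp(cost).
factors = [
    ((0, 1), {(0, 0): 1.0, (0, 1): 0.0, (1, 0): 0.0, (1, 1): 1.5}),
    ((1, 2), {(0, 0): 0.0, (0, 1): 0.5, (1, 0): 0.5, (1, 1): 0.0}),
]
num_vars = 3


def joint_distribution(factors, num_vars):
    """Normalized product of factors over every joint assignment."""
    weights = {}
    for d in itertools.product((0, 1), repeat=num_vars):
        cost = sum(table[tuple(d[i] for i in scope)] for scope, table in factors)
        weights[d] = math.exp(cost)
    z = sum(weights.values())
    return {d: w / z for d, w in weights.items()}


def marginals(joint, num_vars):
    """Single-variable marginals obtained by summing the joint distribution."""
    result = [{0: 0.0, 1: 0.0} for _ in range(num_vars)]
    for d, prob in joint.items():
        for i, value in enumerate(d):
            result[i][value] += prob
    return result


joint = joint_distribution(factors, num_vars)
margs = marginals(joint, num_vars)

# Assumed sampling scheme: draw each decision variable from its own marginal.
rng = random.Random(0)
decision = tuple(rng.choices((0, 1), weights=(m[0], m[1]))[0] for m in margs)
print(margs, decision)
```

              The enumeration visits 2^n joint assignments for n binary variables, which is why
              message-passing methods that exploit the factorization are preferred in practice.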
                 Two features can help us address these challenges. First, we note that
              maximizing a global team decision cost function is equivalent to maximizing
              the joint decision probability function. This process, known as maximum a
              posteriori (MAP) estimation, requires obtaining max-marginal probability
              values rather than marginal probabilities. Second, instead of the exact com-
              putation of belief distributions, we use an approximate solution produced by
              the standard belief propagation algorithm (Yedidia et al., 2005), based on the
              Bethe approximation of the free-energy function. We use the max-product
              algorithm to reduce the space of distributions to analyze, lowering the
              computational complexity. The max-product belief propagation algorithm