Page 401 - Mechatronics for Safety, Security and Dependability in a New Era

Ch78-I044963.fm  Page 385  Monday, August 7, 2006  11:30 AM
                      AN EFFECTIVE STATE-SPACE CONSTRUCTION METHOD
                         FOR REINFORCEMENT LEARNING OF MULTI-LINK
                                             MOBILE ROBOTS


                                M. Nunobiki 1, K. Okuda 1 and S. Maeda 2

                  1 Department of Mechanical and System Engineering, University of Hyogo,
                    2167 Shosha, Himeji, Hyogo, JAPAN
                  2 Department of Quality Assurance, Shin Caterpillar Mitsubishi LTD.,
                    1106-4 Shimizu Uozumi, Akashi, Hyogo, JAPAN



                  ABSTRACT

                  One of the problems in reinforcement learning with real robots is that it requires a large number
                  of trials. This paper proposes a reinforcement learning method that uses fuzzy ART to segment the
                  state-space. Whenever fuzzy ART encounters a new situation, it adds a new category node to the
                  state-space. We propose methods for generating new category nodes that inherit the state-value and
                  the policy from a similar node. The proposed methods were evaluated in simulations of a two-link
                  manipulator and a multi-link mobile robot. The simulations confirmed that the proposed method
                  increases the learning speed and reduces the size of the state-space.
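
The node-generation idea summarized above can be illustrated with a minimal sketch. It is not the authors' implementation: the class name `FuzzyARTStateSpace`, the parameter values, and the rule that a new node copies the state-value and policy of the existing node with the highest fuzzy ART choice value are all assumptions made for illustration. The sketch uses the standard fuzzy ART operations (complement coding, choice function, vigilance test, weight update) and assumes states are normalized to [0, 1].

```python
import numpy as np


class FuzzyARTStateSpace:
    """Illustrative fuzzy ART segmenter; new nodes inherit value and policy."""

    def __init__(self, n_actions, rho=0.8, alpha=0.001, beta=1.0):
        self.rho = rho        # vigilance: higher means finer categories
        self.alpha = alpha    # choice parameter
        self.beta = beta      # learning rate (1.0 = fast learning)
        self.n_actions = n_actions
        self.weights = []     # one weight vector per category node
        self.values = []      # critic: state-value per node
        self.policies = []    # actor: action preferences per node

    def categorize(self, state):
        """Return the node index for a state in [0, 1]^d, adding a node if needed."""
        a = np.asarray(state, dtype=float)
        i = np.concatenate([a, 1.0 - a])  # complement coding
        if not self.weights:
            return self._add_node(i, parent=None)
        # |I ^ w_j| for every node, where ^ is the element-wise minimum
        matches = [np.minimum(i, w).sum() for w in self.weights]
        choice = [m / (self.alpha + w.sum())
                  for m, w in zip(matches, self.weights)]
        order = np.argsort(choice)[::-1]  # best choice value first
        for j in order:
            if matches[j] / i.sum() >= self.rho:  # vigilance test passed
                w = self.weights[j]
                self.weights[j] = (self.beta * np.minimum(i, w)
                                   + (1.0 - self.beta) * w)
                return int(j)
        # No node resonates: this is a new situation, so add a node.
        # Assumption: the "similar node" is the one with the best choice value.
        return self._add_node(i, parent=int(order[0]))

    def _add_node(self, i, parent):
        self.weights.append(i.copy())
        if parent is None:
            self.values.append(0.0)
            self.policies.append(np.zeros(self.n_actions))
        else:
            # Inherit the state-value and policy instead of starting from zero.
            self.values.append(self.values[parent])
            self.policies.append(self.policies[parent].copy())
        return len(self.weights) - 1
```

In this sketch an actor-critic learner would call `categorize` at every step and then read and update `values[node]` and `policies[node]` for the returned node. Starting a new node from an inherited estimate rather than from zero is the kind of mechanism that could reduce the number of trials.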


                  KEYWORDS

                  Reinforcement learning, Actor-critic, State-space construction, Fuzzy ART, Inheritance of state-value,
                  Manipulator, Multi-link mobile robot, Action acquisition



                  INTRODUCTION
                  We have developed inchworm-type mobile robots to search for life in collapsed buildings. Although
                  these robots have neither legs nor wheels, they can advance by using a vertical undulatory motion of
                  the whole body (Takita et al.). These robots demonstrated high mobility (Nunobiki et al.). However,
                  it was difficult to generate suitable walking motions because humans cannot intuitively understand
                  the motions of a multi-link robot. Reinforcement learning (Sutton and Barto) is therefore expected
                  to be useful for generating trajectories for these robots. This paper deals with the actor-critic
                  learning method (Barto and Sutton). Although actor-critic methods require minimal computation to
                  select an action from a continuous-valued action space, their performance is not yet sufficient
                  for application to real robots (Morimoto and Doya). A grid-like representation of the state-space
                  was insufficient to the