that the odd rows are captured first and the even rows afterward. When such a camera is used in a dynamic environment, for example on a moving robot, adjacent rows show the dynamic scene at two different points in time, differing by up to one-thirtieth of a second. The result is an artificial blurring caused by motion rather than by optical defocus. By comparing only the even-numbered rows, we avoid this interlacing side effect.
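As a minimal illustrative sketch (the 0-based indexing convention and array layout are assumptions, not from the text), keeping only the even-numbered rows of an interlaced frame discards one field, and with it the temporal offset between the two fields:

    import numpy as np

    frame = np.zeros((480, 640), dtype=np.uint8)  # placeholder interlaced frame
    even_rows = frame[0::2, :]                    # rows 0, 2, 4, ...: a single field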
  Recall that the three images are each taken with a camera using a different focus position. Based on the focus position, we call each image close, medium, or far. A 5 x 3 coarse depth map of the scene is constructed quickly by simply comparing the sharpness values of the three corresponding regions. Thus the depth map assigns only two bits of depth information to each region, using the values close, medium, and far. The critical step is to adjust the focus positions of all three cameras so that flat ground in front of the obstacle results in medium readings in one row of the depth map. Then, unexpected readings of either close or far indicate convex and concave obstacles respectively, enabling basic obstacle avoidance in the vicinity of objects on the ground as well as drop-offs in the ground.
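  A rough sketch of this coarse depth-from-focus step follows; it is illustrative only, and the gradient-energy sharpness measure, the grid orientation (3 rows by 5 columns), and the choice of ground row are assumptions rather than details taken from the text.

    import numpy as np

    LABELS = ("close", "medium", "far")

    def sharpness(region):
        # High-frequency energy of a region: sum of squared intensity gradients.
        gy, gx = np.gradient(region.astype(float))
        return float(np.sum(gx**2 + gy**2))

    def coarse_depth_map(img_close, img_medium, img_far, rows=3, cols=5):
        # Label each grid cell with the focus setting whose image is sharpest there.
        h, w = img_close.shape
        depth = np.empty((rows, cols), dtype=object)
        for r in range(rows):
            for c in range(cols):
                ys = slice(r * h // rows, (r + 1) * h // rows)
                xs = slice(c * w // cols, (c + 1) * w // cols)
                scores = [sharpness(img[ys, xs])
                          for img in (img_close, img_medium, img_far)]
                depth[r, c] = LABELS[int(np.argmax(scores))]
        return depth

    def ground_row_alerts(depth, ground_row):
        # Flat ground should read "medium"; "close" suggests a convex obstacle,
        # "far" suggests a concave drop-off.
        return [(c, label) for c, label in enumerate(depth[ground_row])
                if label != "medium"]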
                             Although sufficient for obstacle avoidance, the above depth from focus algorithm pre-
                           sents unsatisfyingly coarse range information. The alternative is depth from defocus, the
                           most desirable of the focus-based vision techniques.
                             Depth from defocus methods take as input two or more images of the same scene, taken
                           with different, known camera geometry. Given the images and the camera geometry set-
                           tings, the goal is to recover the depth information of the 3D scene represented by the
                           images. We begin by deriving the relationship between the actual scene properties (irradi-
                           ance and depth), camera geometry settings, and the image g that is formed at the image
                           plane.
  The focused image f(x, y) of a scene is defined as follows. Consider a pinhole aperture (L = 0) in lieu of the lens. For every point p at position (x, y) on the image plane, draw a line through the pinhole aperture to the corresponding, visible point P in the actual scene. We define f(x, y) as the irradiance (or light intensity) at p due to the light from P. Intuitively, f(x, y) represents the intensity image of the scene perfectly in focus.
  The point spread function h(x_g, y_g, x_f, y_f, R) is defined as the amount of irradiance from point P in the scene (corresponding to (x_f, y_f) in the focused image f) that contributes to point (x_g, y_g) in the observed, defocused image g. Note that the point spread function depends not only upon the source, (x_f, y_f), and the target, (x_g, y_g), but also on R, the blur circle radius. R, in turn, depends upon the distance from point P to the lens, as can be seen by studying equations (4.19) and (4.20).
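  As an illustrative sketch only (not the book's implementation): under the homogeneity assumption introduced next, h reduces to a uniform "pillbox" kernel of radius R, and for a scene at a single depth (constant R) the defocused image g can be formed by convolving the focused image f with that kernel. The function names and the constant-R simplification are assumptions.

    import numpy as np

    def pillbox_kernel(R):
        # Uniform disc of radius R, normalized so total irradiance is preserved.
        r = int(np.ceil(R))
        y, x = np.mgrid[-r:r + 1, -r:r + 1]
        k = (x**2 + y**2 <= R**2).astype(float)
        return k / k.sum()

    def defocus(f, R):
        # g(x_g, y_g) = sum over (x_f, y_f) of h(x_g, y_g, x_f, y_f, R) * f(x_f, y_f),
        # here with a constant blur radius R (fronto-parallel scene).
        k = pillbox_kernel(R)
        r = k.shape[0] // 2
        fp = np.pad(f.astype(float), r, mode="edge")
        g = np.zeros(f.shape, dtype=float)
        for dy in range(-r, r + 1):
            for dx in range(-r, r + 1):
                w = k[dy + r, dx + r]
                if w > 0:
                    g += w * fp[r + dy:r + dy + f.shape[0],
                                r + dx:r + dx + f.shape[1]]
        return g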
                             Given the assumption that the blur circle is homogeneous in intensity, we can define h
                           as follows: