Page 246 - Embedded Microprocessor Systems Real World Design

cumulative error is not a problem or resynchronize each processor to the leading
                edge of each block. This may require more sensors than otherwise would be
                required for system operation.
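
To see why resynchronizing on each block matters, consider the following minimal sketch. It is written in Python purely for illustration (real firmware would more likely be C or assembly). It assumes two processors that count nominal 1-millisecond ticks, one of which runs fast by a fixed fraction. The function name, its parameters, and the numbers used are hypothetical and are not taken from any particular system.

```python
def max_tick_disagreement(rate_error, ticks_per_block, blocks, resync):
    """Return the worst disagreement (in ticks) between two tick counters.

    Counter A is assumed ideal; counter B runs fast by `rate_error`
    (a fraction, so 0.00003 means 0.003 percent). Both are hypothetical.
    """
    a = 0.0
    b = 0.0
    worst = 0.0
    for _ in range(blocks):
        if resync:
            # Both processors restart their count at the leading edge of the block.
            a = 0.0
            b = 0.0
        a += ticks_per_block
        b += ticks_per_block * (1.0 + rate_error)
        worst = max(worst, abs(b - a))
    return worst


# Illustrative numbers: 1,000 blocks, each 1,000 ticks (1 second) long.
free_running = max_tick_disagreement(0.00003, 1000, 1000, resync=False)
resynced = max_tick_disagreement(0.00003, 1000, 1000, resync=True)
print(f"free-running: {free_running:.2f} ticks")   # about 30 ticks
print(f"resynchronized: {resynced:.2f} ticks")     # about 0.03 tick
```

Without resynchronization the disagreement grows with every block. With it, the error never exceeds the drift that accumulates within a single block.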

                Revisions. With a multiprocessor system, it often is possible to change the
                firmware for one processor without changing the others. Be sure that such a
                change causes no problems if some function now works differently than before.
                For instance, a new firmware revision might handle error messages from another
                processor with a different priority than the original firmware. Or the maximum
                buffer size might get changed in such a way that it is a problem only if certain
                errors occur. You may need additional regression testing of the combined system
                when firmware is changed.
                  It is a good idea to have a suite of tests that is run any time firmware changes
                are made to any of the processors in the system. The suite would need to test all
                the error conditions and all the communication paths, buffers, and types. Of course,
                this type of error can creep into a single-processor system as well, but it is easier to
                overlook in a multiprocessor system due to the isolation of the CPUs.

                Error Handling. Be sure all the processors handle errors consistently. In the
                wooden block example, if a problem occurs, do not let one processor try to stop
                everything while another tries to keep the conveyer going so everything falls off
                the end.

                Berserk Processors. Where possible, handle the case of a berserk processor that
                writes all through memory or a frozen processor that will not communicate at all.
                Have timeouts on communication operations. When a timeout occurs, you usually
                cannot continue to operate normally, but at least make all the moving/rotating
                mechanisms safe. In cases where you have optional subsystems, the rest of the
                system may need to operate normally when something in the optional part is not
                working.
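
The timeout idea can be sketched as follows. This is a minimal illustration in Python, not firmware. It assumes that the other processor is expected to send some message at least once per timeout period. `PeerWatchdog`, `control_step`, `stop_motors`, and `run_conveyor` are hypothetical names invented for the example, and the clock is simulated so the behavior can be seen without real hardware.

```python
import time


class PeerWatchdog:
    """Tracks how long it has been since the other processor was heard from."""

    def __init__(self, timeout_s, now=time.monotonic):
        self.timeout_s = timeout_s
        self._now = now              # injectable clock (assumption, eases testing)
        self._last_heard = now()

    def message_received(self):
        # Call this whenever any valid message arrives from the peer.
        self._last_heard = self._now()

    def peer_alive(self):
        return (self._now() - self._last_heard) <= self.timeout_s


def control_step(watchdog, make_safe, run_normally):
    """Run one pass of the control loop; go to the safe state if the peer is silent."""
    if watchdog.peer_alive():
        run_normally()
        return "normal"
    make_safe()
    return "safe"


# Demonstration with a simulated clock (seconds) and hypothetical actions.
clock = [0.0]
events = []


def stop_motors():
    events.append("motors stopped")


def run_conveyor():
    events.append("conveyor running")


watchdog = PeerWatchdog(timeout_s=0.5, now=lambda: clock[0])
states = []
clock[0] = 0.1                       # peer heard from recently
states.append(control_step(watchdog, stop_motors, run_conveyor))
clock[0] = 1.0                       # peer silent for longer than the timeout
states.append(control_step(watchdog, stop_motors, run_conveyor))
print(states)                        # ['normal', 'safe']
```

Passing the clock in as a parameter is a design choice made only so the example can be exercised without waiting in real time. In a real system the timeout value and the definition of "safe" depend entirely on the mechanism being controlled.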

                Cumulative Time Errors. When sending data or timing signals from one processor
                to another, be aware that the clocks of the two processors will almost always drift
                slightly. Over a long period of time, this can accumulate to a significant time error.
                Say that two systems operate with crystals having a specified accuracy of 0.003 percent
                (a typical value). These two systems both keep track of time in hours, minutes, and
                seconds. If one crystal is exactly correct and the other one is off by the maximum
                amount (0.003 percent), the two systems will differ by about 2.6 seconds at the end
                of one day.
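
The arithmetic behind that figure is short enough to check directly. The sketch below, in Python for convenience, simply multiplies the elapsed time by the crystal tolerance. The function name is hypothetical.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400 seconds


def worst_case_drift_seconds(tolerance_percent, elapsed_seconds):
    """Time error of a clock that is off by its full tolerance, versus an exact clock."""
    return elapsed_seconds * tolerance_percent / 100.0


drift = worst_case_drift_seconds(0.003, SECONDS_PER_DAY)
print(f"{drift:.3f} seconds per day")  # 2.592 seconds per day
```

If both crystals were off by the maximum amount in opposite directions, the same arithmetic gives twice this value.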
                  If your system depends on two or more processors remaining in synchronization,
                communication between processors should include synchronization information.
                Don’t depend on the clocks staying synchronized well enough that two processors
                counting, say, 1 millisecond interrupt ticks, will stay together. You may have to send

