Page 246 - Embedded Microprocessor Systems Real World Design
cumulative error is not a problem or resynchronize each processor to the leading
edge of each block. This may require more sensors than otherwise would be
required for system operation.
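As a rough illustration of the second option, the sketch below resets a processor's local tick count each time its sensor reports the leading edge of a block, so timing error never accumulates across more than one block. The names (block_tracker_t, tracker_tick, tracker_sensor_sample) are hypothetical and do not come from the text, and the sensor is assumed to read as a simple true/false value.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-processor tracking state (an assumption for illustration). */
typedef struct {
    uint32_t ticks_since_edge; /* local ticks counted since the last leading edge */
    bool     last_sensor;      /* previous sensor sample, used to detect the edge */
} block_tracker_t;

/* Call once per local timer tick. */
static void tracker_tick(block_tracker_t *t)
{
    assert(t != NULL);
    t->ticks_since_edge++;
}

/* Call with each new sensor sample (true = block present).
 * Returns true when a leading edge was seen and the count was reset. */
static bool tracker_sensor_sample(block_tracker_t *t, bool sensor_now)
{
    assert(t != NULL);
    bool leading_edge = sensor_now && !t->last_sensor;
    t->last_sensor = sensor_now;
    if (leading_edge) {
        t->ticks_since_edge = 0; /* discard any error accumulated so far */
    }
    return leading_edge;
}
```

Because the count restarts at every block, the worst-case error is bounded by the drift over one block interval rather than over the whole run.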
Revisions With a multiprocessor system, it often is possible to change the
firmware for one processor without changing the others. Be sure this causes no
problems if some function works differently than before. For instance, a new
firmware revision might handle error messages from another processor with a
different priority than the original firmware. Or the maximum buffer size might
get changed in such a way that it is a problem only if certain errors occur. You may
need additional regression testing of the combined system when firmware is
changed.
It is a good idea to have a suite of tests that is run any time firmware changes
are made to any of the processors in the system. The suite should exercise all the
error conditions and all the communication paths, buffers, and types. Of course,
this type of error can creep into a single-processor system as well, but it is easier to
overlook in a multiprocessor system due to the isolation of the CPUs.
Error Handling Be sure all the processors handle errors consistently. In the
wooden block example, if a problem occurs, do not let one processor try to stop
everything while another tries to keep the conveyer going so everything falls off
the end.
Berserk Processors Where possible, handle the case of a berserk processor that
writes all through memory or a frozen processor that will not communicate at all.
Have timeouts on communication operations. When a timeout occurs you usually cannot
operate normally, but at least put all the moving or rotating mechanisms into a safe
state. In cases where you have
optional subsystems, the rest of the system may need to operate normally when
something in the optional part is not working.
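One way to apply a communication timeout is sketched below, under stated assumptions: each processor keeps a free-running millisecond counter, records when it last heard a valid message from its peer, and treats silence longer than a limit as a failure. The 250 ms limit and all of the names (peer_monitor_t, peer_heard, peer_check, must_make_safe) are hypothetical. A timeout of this kind catches a frozen processor; it does not by itself detect one that is corrupting memory.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define PEER_TIMEOUT_MS 250u /* assumed limit; choose to suit the system */

typedef enum { PEER_OK, PEER_TIMED_OUT } peer_status_t;

typedef struct {
    uint32_t last_heard_ms; /* local time of the last valid message */
    bool     optional;      /* true if the peer is an optional subsystem */
} peer_monitor_t;

/* Record that a valid message arrived from the peer at local time now_ms. */
static void peer_heard(peer_monitor_t *p, uint32_t now_ms)
{
    assert(p != NULL);
    p->last_heard_ms = now_ms;
}

/* Unsigned subtraction keeps the comparison correct when the
 * millisecond counter wraps around. */
static peer_status_t peer_check(const peer_monitor_t *p, uint32_t now_ms)
{
    assert(p != NULL);
    uint32_t silent_ms = now_ms - p->last_heard_ms;
    return (silent_ms > PEER_TIMEOUT_MS) ? PEER_TIMED_OUT : PEER_OK;
}

/* True when the mechanisms should be driven to a safe state: a required
 * peer has gone silent. A silent optional peer does not force a stop. */
static bool must_make_safe(const peer_monitor_t *p, uint32_t now_ms)
{
    return peer_check(p, now_ms) == PEER_TIMED_OUT && !p->optional;
}
```

A main loop might call must_make_safe() on every pass and, when it returns true, stop motors and release actuators before attempting any recovery.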
Cumulative Time Errors When sending data or timing signals from one processor
to another, be aware that the clocks of the two processors will almost always drift
slightly. Over a long period of time, this can accumulate to a significant time error.
Say that two systems operate with crystals having a specified accuracy of 0.003 percent
(a typical value). Both systems keep track of time in hours, minutes, and
seconds. If one crystal is exactly correct and the other is off by the maximum
amount (0.003 percent), the two systems will differ by about 2.6 seconds
(0.00003 x 86,400 seconds) at the end of one day.
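The arithmetic behind this figure can be checked with a few lines of C. This is only a sketch; the function name drift_seconds and the constant SECONDS_PER_DAY are introduced here for illustration.

```c
#include <assert.h>

#define SECONDS_PER_DAY 86400.0

/* Worst-case time error, in seconds, accumulated over elapsed_s seconds
 * between one exact clock and one that is off by tolerance_percent. */
static double drift_seconds(double tolerance_percent, double elapsed_s)
{
    assert(tolerance_percent >= 0.0 && elapsed_s >= 0.0);
    return elapsed_s * (tolerance_percent / 100.0);
}
```

Calling drift_seconds(0.003, SECONDS_PER_DAY) gives roughly 2.592 seconds, which rounds to the 2.6 seconds quoted above. If both crystals were off by the maximum amount in opposite directions, the difference would be twice that.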
If your system depends on two or more processors remaining in synchronization,
communication between processors should include synchronization information.
Don’t depend on the clocks staying synchronized well enough that two processors
counting, say, 1-millisecond interrupt ticks will stay together. You may have to send