Page 297 - Embedded Microprocessor Systems Real World Design
troller on-chip. The i960 also has a local memory bus for accessing DRAM or flash
memory. The i960 VH has an internal 32-bit address space; the PCI bus can be
made part of this address space, or the VH address space can be independent of
the PCI bus. Integration of the PCI bus onto the chip provides a very high level of
performance on a standard interface.
Although you can use the i960 PCI bus interface to create a PCI card slot into
which you can plug standard PC peripheral cards, you can also implement a PCI
bus on a circuit board with no connectors at all. This lets you use ICs designed for
use on PCI bus cards on your embedded circuit board.
Cache Memory
One problem that occurs as processors get faster and faster is the bottleneck of
accessing memory. On-chip speeds inside the CPU are always faster than the speed
of external buses. For example, the PC-standard PCI bus at 66MHz usually is driven
by a CPU with a much faster internal clock. A 100MHz PCI bus typically is connected
to a 300MHz or faster CPU. Likewise, 100MHz SDRAMs connect to 350 or
400MHz CPUs.
The reason for this is that the logic delays inside the CPU are more controllable
and more repeatable than those going off-chip. Also, signal paths inside the chip
are only tiny fractions of an inch, versus longer traces on a PC board. This affects
both the propagation delay and the transmission-line characteristics of the traces.
The bottom line is that a very fast CPU may be unable to execute instructions
at full speed because it is starved for data from a memory that cannot keep up.
One solution to this problem is the addition of cache memory. Cache memory is a
fast memory located close to the CPU and operating closer to CPU speeds. Cache
memory usually is implemented with very fast static RAM.
Cache memory is managed by a cache controller that fetches data from the main
memory and stores it in the cache. Cache memory works because most microprocessor
programs are repetitive in nature: the code loops around and around,
executing the same string of instructions for some time before moving on to some
other piece of code. When the CPU wants to execute code not in the cache, the
cache controller gets the code from main memory (DRAM, usually) and moves it
into the cache. Once in the cache, the code executes very quickly.
If cache is so fast, why not just make all the memory cache? The first reason
is cost: building all main memory out of the super-fast cache SRAM would make
the memory prohibitively expensive. Second, cache SRAM ICs are larger than
equivalent DRAM due to the larger cell size and added number of pins required.
Thus, making all main memory out of cache parts would make the memory array
physically larger, which would limit speed due to trace lengths.