Page 179 - A Practical Guide from Design Planning to Manufacturing
P. 179
152 Chapter Five
A longer global branch history vector allows detection of more complex
patterns, or a history vector can be maintained for each branch rather
than a single vector recording the behavior of all branches. There are
many possible variations and their success varies greatly from one appli-
cation to another. This makes it difficult for a microarchitecture designer
to choose between them. One solution is to implement two different
prediction algorithms in hardware and allow the processor to dynami-
cally select between them. 10
In addition to whatever hardware is supporting each prediction
scheme, another cache of 2-bit counters is maintained tracking which
scheme is more successful. If method A is correct and method B is
incorrect, the counter is decremented. If method A is incorrect and
method B is correct, the counter is incremented. If both methods are
correct or incorrect, no change is made. The branch prediction is then
made using method A for branches whose counter is 0 or 1 and using
method B for branches whose counter is 2 or 3. These hybrid methods
achieve the highest prediction accuracies at the cost of the most hard-
ware support.
For branches that are predicted taken, the next instruction should be
fetched from the target address of the branch. For relative branches, this
means the target offset must be added to the instruction pointer. To allow
fetching to begin sooner, many branch prediction caches store not only
the branch history but also the branch target address. When this is done
the branch prediction cache is called a branch target buffer (BTB). Indirect
branches are a special problem because their target varies during run
time. The branch may be correctly predicted but with the wrong target
address. To deal with this, some BTBs allow indirect branches to have
multiple entries with different target addresses. Past branch behavior is
used to pick which of the target addresses is most likely.
Register renaming
Because branches control the flow of the program, pipeline breaks cre-
ated by branches are called control dependencies. Pipeline breaks caused
by one instruction having to wait for the result of another are called
data dependencies. True data dependencies are caused by the passage
of data from one instruction to another, but in addition there are false data
dependencies created by the reuse of registers in programs. These false
dependencies are eliminated and performance improved by register
renaming.
10
Hilgendorf et al., “Evaluation of branch-prediction methods.”