Page 149 - A Practical Guide from Design Planning to Manufacturing

122   Chapter Four

        them much earlier in a program than when their results will be used
        can give a large performance benefit. However, moving a load before a
        branch may cause exceptions that should never have happened. A branch
        may be specifically checking whether a particular load would cause a
        memory protection violation or other exception before executing it. If
        the compiler moves the load before the branch, the program will no
        longer behave as it would when executed strictly in program order.
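
        The hazard can be illustrated with a toy model. This is only a sketch:
        the MEMORY dictionary, the ProtectionFault exception, and the function
        names are hypothetical stand-ins, not part of any real instruction set.

```python
# Toy memory: only the addresses listed here are readable (assumption).
MEMORY = {0x1000: 42}


class ProtectionFault(Exception):
    """Stands in for a memory protection violation."""


def load(addr):
    # A normal load faults immediately on a protected address.
    if addr not in MEMORY:
        raise ProtectionFault(hex(addr))
    return MEMORY[addr]


def original(addr):
    # Program order: the branch runs first, the load only on the safe path.
    if addr in MEMORY:
        return load(addr)
    return 0


def hoisted(addr):
    # Reordered: the load was moved above the branch to start it earlier.
    value = load(addr)  # may fault even though the branch would have skipped it
    if addr in MEMORY:
        return value
    return 0
```

        For a readable address both versions return the same value. For a
        protected address, original() returns 0 while hoisted() raises an
        exception the program was written to avoid.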
          Another problem with moving loads is possible dependencies on stores.
        Many load and store instructions use registers as part of their address,
        making it impossible for the compiler to determine whether they might
        access the same memory location. In theory, any load could depend upon
        any store. If the compiler cannot move loads before the earlier branch
        or stores, then very little reordering is going to be possible.
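
        A small sketch of the store dependency problem, assuming a list as a
        stand-in for memory and plain integers r1 and r2 as stand-ins for
        address registers (all names here are illustrative):

```python
def store_then_load(mem, r1, r2):
    # Program order: store through r1, then load through r2.
    mem[r1] = 99
    return mem[r2]


def load_then_store(mem, r1, r2):
    # Reordered: the load was hoisted above the store.
    value = mem[r2]
    mem[r1] = 99
    return value


# Different addresses: the reordering is harmless.
same_a = store_then_load([1, 2, 3, 4], 0, 1)
same_b = load_then_store([1, 2, 3, 4], 0, 1)

# Same address: the hoisted load returns the stale value.
fresh = store_then_load([1, 2, 3, 4], 1, 1)
stale = load_then_store([1, 2, 3, 4], 1, 1)
```

        Because r1 and r2 are known only at run time, the compiler cannot tell
        which of the two cases it is compiling for.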
          The EPIC architecture addresses this by supporting speculative loads.
        A speculative load executes like a normal load, but if it would trigger
        an exception, it merely sets a flag bit in its destination register. In
        addition, the address and destination register of the speculative load
        are recorded in a table. Each store checks the table to see whether any
        speculative load has read from the address the store is writing to. If
        so, the destination register of that load is marked as invalid.
          When the program flow reaches the point of the original load, a load
        check instruction is executed instead. If the speculative load caused an
        exception, the exception is raised at this point. If the program flow
        never reaches the load check, the exception is correctly never raised.
        The load check also tests whether the load result has been marked
        invalid by a later store. If so, the load is executed again. Most of
        the time, the load check will pass: the correct value will already have
        been loaded into a register, ready for use by other instructions.
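
        The speculative load and load check described above can be sketched as
        a small simulation. This is a simplified model under stated
        assumptions, not a description of any real EPIC implementation: the
        Machine class, its method names, and the reloads counter are all
        hypothetical, and stores are assumed to target readable addresses.

```python
class ProtectionFault(Exception):
    """Stands in for a memory protection violation."""


class Machine:
    def __init__(self, memory):
        self.memory = dict(memory)  # address -> value; missing = protected
        self.regs = {}              # register name -> value
        self.deferred = set()       # registers whose exception flag bit is set
        self.invalid = set()        # registers invalidated by a later store
        self.table = {}             # register -> address of its speculative load
        self.reloads = 0            # how often a load check had to reload

    def spec_load(self, reg, addr):
        # Record the load in the table, then load or defer the exception.
        self.table[reg] = addr
        self.invalid.discard(reg)
        if addr in self.memory:
            self.regs[reg] = self.memory[addr]
            self.deferred.discard(reg)
        else:
            self.regs[reg] = None
            self.deferred.add(reg)  # set the flag bit instead of faulting

    def store(self, addr, value):
        if addr not in self.memory:
            raise ProtectionFault(hex(addr))
        self.memory[addr] = value
        # Invalidate any speculative load that read from this address.
        for reg, loaded_addr in self.table.items():
            if loaded_addr == addr:
                self.invalid.add(reg)

    def load_check(self, reg):
        # Executed at the position of the original load.
        addr = self.table.pop(reg)
        if reg in self.deferred:
            self.deferred.discard(reg)
            raise ProtectionFault(hex(addr))  # deferred exception raised now
        if reg in self.invalid:
            self.invalid.discard(reg)
            self.regs[reg] = self.memory[addr]  # execute the load again
            self.reloads += 1
        return self.regs[reg]
```

        In this model, a spec_load() from a protected address does nothing
        visible unless load_check() is later reached for the same register,
        and a store() to a speculatively loaded address costs one extra load
        at the check instead of producing a wrong result.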
          The EPIC architecture has even more potential performance advantages
        over older CISC architectures than RISC architectures do, but it has
        had difficulty achieving widespread acceptance. The same desire for
        compatibility that has kept RISC architectures from becoming more
        widely used has limited the use of EPIC processors. Initially, one
        important selling point for EPIC was support of 64-bit registers
        and virtual addresses, but recent extensions to the x86 architecture
        have duplicated much of this functionality. The future of EPIC remains
        uncertain, but it shows how much Moore’s law has changed the kinds
        of architectures that are possible.


        Recent x86 extensions
        Since 2003, increasing power densities have slowed the rate of frequency
        improvement in x86 implementations. This has led to both Intel and
        AMD placing more emphasis on architecture as a way of providing