Page 231 - Introduction to Microcontrollers Architecture, Programming, and Interfacing of The Motorola 68HC12

208                                         Chapter 7 Arithmetic Operations








        The rounding process for addition of numbers with opposite signs (e.g., subtraction) is
        exactly like that above except that the round byte must be included in the subtraction,
        and renormalization may be necessary after the significands are subtracted. In this
        renormalization step, several left shifts of the significand may be required, and each
        shift requires a bit b to fill in the least significant bit of the significand. This bit may
        be obtained from the round byte as shown below. (The sticky bit may also be replaced
        by zero in the process pictured without altering the final result. However, at least one
        round bit is required.) After renormalization, the rounding process is identical to (16).
        As an example,

                           2^0   * 1.0000 . . . 0
                         - 2^-23 * 1.1110 . . . 0

        becomes
                           2^0 * 1.0000 . . . 00
                         - 2^0 * 0.0000 . . . 01(11100000)
                           2^0 * 0.1111 . . . 10(00100000)

        which, after renormalization and rounding, becomes 2^-1 * 1.11 . . . 100. Subroutines
        for floating-point addition and multiplication are given in Hiware's C and C++ libraries.
        To illustrate the principles without an undue amount of detail, the subroutines are given
        only for normalized floating-point numbers. Underflow is handled by flushing the result
        to zero and setting an underflow flag, and overflow is handled by setting an overflow flag
        and returning the largest possible magnitude with the correct sign. These subroutines
        do not fully conform to the IEEE standard but illustrate the basic algorithms, including rounding. The
        procedure for addition is summarized in Figure 7.20, where one should note that the
        significands are added as signed-magnitude numbers.
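        The align, subtract, renormalize, and round sequence just described can be sketched
        in Python. This is an illustrative model, not the book's library subroutines: a value
        is an (exponent, significand) pair with a 24-bit significand whose hidden leading 1
        sits in bit 23, extended by an 8-bit round byte whose low bit serves as the sticky bit.

```python
# Sketch of subtraction of like-signed normalized operands, |a| >= |b|,
# keeping an 8-bit round byte below the 24-bit significand.
def fsub(exp_a, sig_a, exp_b, sig_b):
    # Extend each significand with an 8-bit round byte (low 8 bits).
    a = sig_a << 8
    b = sig_b << 8
    # Align b by the exponent difference; bits shifted out past the
    # round byte are ORed into the sticky bit (bit 0).
    shift = exp_a - exp_b
    if shift >= 32:
        b = 1 if b else 0
    else:
        sticky = 1 if (b & ((1 << shift) - 1)) else 0
        b = (b >> shift) | sticky
    diff = a - b                 # signed-magnitude subtract, a >= b assumed
    # Renormalize: shift left until the hidden bit (bit 31) is set again.
    # Each left shift pulls the next bit b out of the round byte.
    while diff and not (diff & 0x80000000):
        diff <<= 1
        exp_a -= 1
    # Round to nearest, ties to even, using the round byte.
    sig, rnd = diff >> 8, diff & 0xFF
    if rnd > 0x80 or (rnd == 0x80 and (sig & 1)):
        sig += 1
        if sig == 1 << 24:       # rounding carried out of the significand
            sig >>= 1
            exp_a += 1
    return exp_a, sig

# The example from the text: 2^0 * 1.000...0  -  2^-23 * 1.1110...0
print(fsub(0, 0x800000, -23, 0xF00000))  # (-1, 0xFFFFFC) = 2^-1 * 1.11...100
```

        Running the text's example through this sketch reproduces the renormalized result
        above: one left shift, exponent -1, and round byte 01000000, which truncates.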
            One other issue with floating-point numbers is conversion. For example, how does
        one convert the decimal floating-point number 3.45786 * 10^4 into a binary floating-point
        number with the IEEE format? One possibility is to have a table of binary floating-point
        numbers, one for each power of ten in the range of interest. One can then compute the
        expression
                       3 * 10^4 + 4 * 10^3 + . . . + 6 * 10^-1

        using the floating-point add and floating-point multiply subroutines. One difficulty with
        this approach is that accuracy is lost because of the number of floating-point multiplies
        and adds that are used. For example, for eight decimal digits in the decimal significand,
        there are eight floating-point multiplies and seven floating-point adds used in the
        conversion process. To get around this, one could write 3.45786 * 10^4 as
        .345786 * 10^5 and multiply the binary floating-point equivalent of 10^5
        (obtained again from a table) by
        the binary floating-point equivalent of .345786. This, of course, would take only one
        floating-point multiply and a conversion of the decimal fraction to a binary floating-
        point number.
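        Both conversion approaches can be sketched in Python. The names here are
        hypothetical and Python's native floats stand in for the IEEE subroutines, with
        POW10 playing the role of the table of binary floating-point powers of ten.

```python
# Table of binary floating-point equivalents of powers of ten,
# standing in for the table mentioned in the text.
POW10 = {k: 10.0 ** k for k in range(-8, 9)}

# First approach: one table lookup, multiply, and add per decimal digit.
# For "345786" with n = 5 this computes 3*10^4 + 4*10^3 + ... + 6*10^-1.
def convert_per_digit(digits, n):
    total = 0.0
    for i, d in enumerate(digits):
        total = total + int(d) * POW10[n - 1 - i]
    return total

# Second approach: convert the decimal fraction .345786 to a binary
# floating-point number once, then do a single multiply by 10^5.
def convert_one_multiply(digits, n):
    frac = int(digits) / POW10[len(digits)]   # .345786 as a binary float
    return frac * POW10[n]

print(convert_per_digit("345786", 5))     # ~34578.6
print(convert_one_multiply("345786", 5))  # ~34578.6
```

        The first routine performs a multiply and an add per digit, so its rounding errors
        accumulate; the second incurs the error of one fraction conversion and one multiply.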