Page 332 - Embedded Microprocessor Systems Real World Design
P. 332

Now we  can create a 16-bit floating-point number from our example value:
                   0       100000      010001 010
                   sign    exponent    mantissa, leading 1 implied

                The general steps for converting a decimal number  in the form xxx.yyy  to floating-point
                format are:
                  Convert xxx (digits to the left of the decimal point)  to binary  (call it aaaa). Convert yyy
                  (digits to right of decimal point) to fractional binary, call it bbb. Write as a fractional binary
                  number:

                  aaa.bbbb
                  Shift the number to the right, keeping track of the exponent, until there is a single 1 to
                  the left of the radix point:

                  a.aabbb (6bit example shown, works for any size number)
                  exponent = 2  (because we  shifted two positions right)
                  Drop the leading 1 and calculate the exponent using the bias of the exponent field. If the
                  number is positive, make the sign bit 0. Otherwise the sign bit is 1.
                   The IEEE has developed a standard for representation  of floating-point numbers. The
                IEEE format defines single and double precision values. The IEEE single-precision format
                uses  1 sign  bit, 8 exponent  bits, and 23 mantissa bits, for  a total of  32  bits. The double
                precision standard uses 64 bits: 1 sign bit, 11 exponent bits, and 52 fractional bits. The single
                precision exponent can range from -127  to +127, and the double precision exponent can
                range from -1023  to +1023.
                   Finally, what do we  do to indicate zero? Zero can’t be represented by  1.xxxx. The IEEE
                standard defines zero as being represented when the exponent and mantissa are both zero.
                The sign bit can be either.
                   The IEEE standard also reserves the maximum exponent value (FF for single precision,
                7FF for double precision) to indicate an overflow condition-numbers  that are either too
                small or too large to be represented.
























                Appendix B                                                           313
   327   328   329   330   331   332   333   334   335   336   337