Page 297 - ARM 64 Bit Assembly Language

represents

$$-1.0000100101 \times 2^{11001 - 01111} = -1.0000100101 \times 2^{10} = -10000100101.0 = -1061_{10}.$$
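
The decoding above can be checked mechanically. The following is a minimal sketch, assuming the example uses the half-precision layout of 1 sign bit, 5 exponent bits (bias 15), and 10 fraction bits; the field values are read off the equation, and the helper name `decode_half` is hypothetical. It decodes the fields by hand and compares the result with Python's built-in half-precision unpacking.

```python
import struct

# Fields taken from the equation above. The 1/5/10 bit layout and the
# bias of 15 (01111) are assumptions about the format being illustrated.
SIGN, EXPONENT, FRACTION = 1, 0b11001, 0b0000100101

def decode_half(sign, exponent, fraction):
    """Decode a normalized half-precision value from its three fields."""
    significand = 1 + fraction / 2**10      # restore the implicit leading 1
    return (-1) ** sign * significand * 2.0 ** (exponent - 15)

# Pack the fields into one 16-bit pattern: sign | exponent | fraction.
bits = (SIGN << 15) | (EXPONENT << 10) | FRACTION

manual = decode_half(SIGN, EXPONENT, FRACTION)
# struct's 'e' format interprets two bytes as an IEEE 754 half-precision value.
library = struct.unpack('<e', struct.pack('<H', bits))[0]

print(hex(bits), manual, library)   # 0xe425 -1061.0 -1061.0
```

Both paths should agree, which confirms that subtracting the bias and shifting the binary point ten places yields the decimal value given above.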

8.7.2 IEEE 754 single-precision

The single-precision format provides a 23-bit mantissa and an 8-bit exponent. This is enough
to represent a reasonably large range (magnitudes up to roughly 3.4 × 10^38) with reasonable
precision (about seven significant decimal digits). Together with the sign bit, this type can be
stored in 32 bits, so it is relatively compact. At the time that the IEEE standards were defined,
most machines used a 32-bit word and were optimized for moving and processing data in 32-bit
quantities. For many applications, this format represents a good trade-off between performance
and precision.
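
To make the field layout concrete, here is a small sketch that splits a value stored in single precision into its sign, exponent, and mantissa fields. The function name `float32_fields` is hypothetical, and the bias of 127 is the standard single-precision bias rather than something stated in this section.

```python
import struct

def float32_fields(value):
    """Return (sign, biased exponent, fraction) of value as single precision."""
    # Round the value to single precision, then reinterpret its 4 bytes
    # as an unsigned 32-bit integer so the fields can be masked out.
    (bits,) = struct.unpack('<I', struct.pack('<f', value))
    sign = bits >> 31                 # 1 bit
    exponent = (bits >> 23) & 0xFF    # 8 bits, biased by 127
    fraction = bits & 0x7FFFFF        # 23 bits, implicit leading 1 not stored
    return sign, exponent, fraction

sign, exponent, fraction = float32_fields(-1061.0)
# Show the sign, the unbiased exponent, and the 23-bit fraction field.
print(sign, exponent - 127, format(fraction, '023b'))
```

For -1061.0 the unbiased exponent comes out as 10, and the fraction field begins with the same ten bits (0000100101) as the half-precision example, padded with zeros to 23 bits.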

8.7.3 IEEE 754 double-precision

The double-precision format was designed to provide enough range and precision for most
scientific computing requirements. It provides an 11-bit exponent and a 52-bit mantissa (53 bits
of precision, counting the implicit leading 1). When the IEEE 754 standard was introduced, this
format was not supported by most hardware. That has changed. Most modern floating-point
hardware is optimized for the IEEE 754 double-precision standard, and most modern processors
are designed to move 64-bit or larger quantities. On modern floating-point hardware, this is the
most efficient representation. However, processing large arrays of double-precision data requires
twice as much memory, and twice as much memory bandwidth, as single-precision data.
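
The same field extraction works for double precision, and the storage cost is easy to quantify. This is a sketch only: `float64_fields` and `storage_bytes` are hypothetical helper names, and the bias of 1023 is the standard double-precision bias, assumed here rather than taken from the text.

```python
import struct

def float64_fields(value):
    """Return (sign, biased exponent, fraction) of value as double precision."""
    # Reinterpret the 8 bytes of the double as an unsigned 64-bit integer.
    (bits,) = struct.unpack('<Q', struct.pack('<d', value))
    sign = bits >> 63                    # 1 bit
    exponent = (bits >> 52) & 0x7FF      # 11 bits, biased by 1023
    fraction = bits & ((1 << 52) - 1)    # 52 stored bits
    return sign, exponent, fraction

def storage_bytes(count, fmt):
    """Bytes needed for `count` packed values of struct format `fmt`."""
    return count * struct.calcsize('<' + fmt)

sign, exponent, fraction = float64_fields(-1061.0)
print(sign, exponent - 1023, hex(fraction))

# One million elements: 4 bytes each in single, 8 bytes each in double.
print(storage_bytes(1_000_000, 'f'), storage_bytes(1_000_000, 'd'))
```

The second line of output shows the factor of two in memory footprint, which is also the factor by which memory traffic grows when an array is streamed through the processor.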
