Page 295 - ARM 64 Bit Assembly Language



Listing 8.10 Efficient representation of a binimal.

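typedef struct {
    unsigned int sign : 1;       /* unsigned, so the field holds 0 or 1 */
    unsigned int exponent : 8;   /* stored (biased) exponent */
    unsigned int mantissa : 23;  /* fraction bits */
} IEEEsingle;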

described using bit fields in C, as described above. Many processors have hardware that is
specifically designed to perform arithmetic on data in the standard IEEE formats. The following
sections highlight most of the numerical definitions in the IEEE standard.
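As a hedged illustration of the field layout in Listing 8.10, the sketch below extracts the same
three fields from a C float using shifts and masks rather than bit fields, because the order in
which a compiler packs bit fields is implementation-defined. The names SingleFields and
split_single are hypothetical and do not come from the text, and the code assumes that float is
a 32-bit IEEE single-precision value.

```c
#include <stdint.h>
#include <string.h>

/* Assumption: float is a 32-bit IEEE single-precision value. */
_Static_assert(sizeof(float) == sizeof(uint32_t), "float must be 32 bits");

/* Hypothetical helper type: the three fields of Listing 8.10, unpacked. */
typedef struct {
    uint32_t sign;      /* 1 bit */
    uint32_t exponent;  /* 8 bits, as stored (still biased) */
    uint32_t mantissa;  /* 23 bits */
} SingleFields;

/* Split a float into its fields with shifts and masks. */
SingleFields split_single(float f)
{
    uint32_t bits;
    SingleFields out;

    memcpy(&bits, &f, sizeof bits);        /* well-defined way to read the bits */

    out.sign     = bits >> 31;             /* bit 31 */
    out.exponent = (bits >> 23) & 0xFFu;   /* bits 30..23 */
    out.mantissa = bits & 0x7FFFFFu;       /* bits 22..0 */
    return out;
}
```

With this sketch, split_single(1.0f) yields a sign of 0, a stored exponent of 127, and a
mantissa of 0.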
The IEEE standard specifies the bitwise representation for numbers, and specifies parameters
for how arithmetic is to be performed. The standard also provides for quantities that cannot
be represented as ordinary numbers. For example, any quantity that is greater
than the most positive representable value is represented as positive infinity, and any quantity
that is less than the most negative representable value is represented as negative infinity. There
are special bit patterns to encode these quantities. The programmer or hardware designer is
responsible for ensuring that their implementation conforms to the IEEE standards. The following
sections describe some of the IEEE standard data formats.
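The overflow behavior described above can be observed directly. The following is a minimal
sketch, assuming a 32-bit IEEE single-precision float and the default round-to-nearest mode;
the function names are hypothetical. Doubling the largest finite float (FLT_MAX) gives a result
that is too large to represent, so the result is infinity.

```c
#include <float.h>
#include <stdint.h>
#include <string.h>

/* Assumption: float is a 32-bit IEEE single-precision value. */
_Static_assert(sizeof(float) == sizeof(uint32_t), "float must be 32 bits");

/* Return the raw 32-bit pattern of a float. */
uint32_t float_bits(float f)
{
    uint32_t bits;
    memcpy(&bits, &f, sizeof bits);
    return bits;
}

/* Exceed the most positive representable value. */
float overflow_positive(void)
{
    volatile float big = FLT_MAX;    /* volatile keeps the multiply at run time */
    return big * 2.0f;
}

/* Go below the most negative representable value. */
float overflow_negative(void)
{
    volatile float big = -FLT_MAX;
    return big * 2.0f;
}
```

Under those assumptions, float_bits(overflow_positive()) is 0x7F800000 and
float_bits(overflow_negative()) is 0xFF800000. The two patterns differ only in the sign bit:
the exponent field is all ones and the mantissa field is all zeros.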

8.7.1 IEEE 754 half-precision

The half-precision format gives a 16-bit encoding for fractional numbers with a small range
and low precision. There are situations where this format is adequate. If the computation is
being performed on a very small machine, then using this format may result in significantly
better performance than could be attained using one of the larger IEEE formats. However, in
most situations, the programmer can achieve better performance and/or precision by using a
fixed-point representation. The format is as follows:

• The Significand (a.k.a. “Mantissa” or “Fractional Part”) is stored using a sign-magnitude
coding, with bit 15 being the sign bit.
• The exponent is an excess-15 number, i.e. the number stored is 15 greater than the actual
exponent.
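
The two properties listed so far can be sketched in C as follows. This is an illustration only:
the function names are hypothetical, and the position and width of the exponent field (5 bits,
in bits 14 through 10) are taken from the IEEE 754 half-precision format rather than from the
list items above.

```c
#include <stdint.h>

/* Sign of a half-precision value: bit 15 (1 means negative). */
int half_sign(uint16_t h)
{
    return (h >> 15) & 1;
}

/* Exponent exactly as stored.
 * Assumption: a 5-bit field in bits 14..10, per IEEE 754 half precision. */
int half_stored_exponent(uint16_t h)
{
    return (h >> 10) & 0x1F;
}

/* Excess-15: the stored value is 15 greater than the actual exponent.
 * This sketch does not treat any stored exponent value specially. */
int half_actual_exponent(uint16_t h)
{
    return half_stored_exponent(h) - 15;
}
```

For example, the pattern 0x3C00 (which encodes 1.0) has a sign of 0 and a stored exponent of 15,
so its actual exponent is 0.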
