Page 330 - ARM 64 Bit Assembly Language
Name    Page  Operation
fsub    309   Subtract
ldnp    301   Load Non-Temporal Pair
ldp     301   Load Pair
scvtf   306   Convert Signed Fixed Point to Float
scvtf   305   Convert Signed Integer to Float Using FPCR Rounding Mode
stnp    301   Store Non-Temporal Pair
stp     301   Store Pair
ucvtf   306   Convert Unsigned Fixed Point to Float
ucvtf   305   Convert Unsigned Integer to Float Using FPCR Rounding Mode
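The fixed point forms of scvtf and ucvtf listed above can be modeled in C as a rough sketch of the arithmetic. It assumes the instruction form that takes a 64-bit integer register and an fbits immediate between 1 and 64, where fbits is the number of fraction bits in the source value. The helper names are hypothetical and are not part of any library; the integer-to-double cast uses the C default rounding mode (round to nearest), which stands in for the FPCR rounding mode here.

```c
#include <stdint.h>

/* Hypothetical helper: returns 2^fbits as a double, for 1 <= fbits <= 64.
 * Shifting by (fbits - 1) and doubling avoids an undefined 64-bit shift. */
static double fixed_scale(unsigned fbits)
{
    return (double)(UINT64_C(1) << (fbits - 1)) * 2.0;
}

/* Sketch of the signed fixed point conversion (scvtf with an fbits operand):
 * interpret raw as a signed value with fbits fraction bits. */
double fixed_to_double_signed(int64_t raw, unsigned fbits)
{
    /* Dividing by a power of two only changes the exponent,
     * so the cast is the only step that can round. */
    return (double)raw / fixed_scale(fbits);
}

/* Sketch of the unsigned fixed point conversion (ucvtf with an fbits operand):
 * same arithmetic, but raw is treated as unsigned. */
double fixed_to_double_unsigned(uint64_t raw, unsigned fbits)
{
    return (double)raw / fixed_scale(fbits);
}
```

For example, with 16 fraction bits the raw value 0x18000 represents 1.5, so fixed_to_double_signed(0x18000, 16) returns 1.5, and fixed_to_double_signed(-0x8000, 16) returns -0.5.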
9.10 Chapter summary
The AArch64 FP/NEON coprocessor adds a great deal of power to the ARM architecture.
The FP/NEON register set can hold more than twice as much data as the
AArch64 integer registers. The additional instructions allow the programmer to work directly
with the most common IEEE 754 floating point formats. In the next chapter,
we will see that the ability to treat groups of registers as vectors can yield a significant
performance improvement. The GCC compiler does not always make good use of these advanced
features, which can give the assembly programmer a big advantage when high-performance code
is needed.
Exercises
9.1. How many registers does the FP/NEON coprocessor add to the AArch64 architecture?
9.2. What is the purpose of the AHP, DN, FZ, RMODE, and FZ16 bits in the FPCR?
9.3. How are floating point parameters passed to subroutines? How is a pointer to a floating
point value (or array of values) passed to a subroutine?
9.4. Write the following C code in AArch64 assembly:
for (x = 0.0; x != 10.0; x += 0.1)
{
  .
  .
  .
}
9.5. In the previous exercise, the C code contains a subtle bug.
• What is the bug?