Page 292 - ARM 64 Bit Assembly Language
        bl      fixed_cos
        mov     x1,#28
        ldr     x2,=fmta
        bl      printS

        ldr     x0,=newline
        bl      printf

        add     x27,x27,#1              // increment the loop counter
        cmp     x27,#33
        blt     mloop                   // repeat while x27 < 33

        ldp     x29,x30,[sp],#16        // restore frame pointer and link register
        ldp     x27,x28,[sp],#16        // restore callee-saved registers
        ret

//----------------------------------------------------------------
Table 8.4: Performance of sine function with various implementations.

Optimization  Implementation                       CPU seconds
------------  -----------------------------------  -----------
None          32-bit Fixed Point Assembly                 5.15
None          32-bit Fixed Point C                       18.60
None          Single Precision floating point C          10.56
None          Double Precision floating point C          10.07
-O2           32-bit Fixed Point Assembly                 4.73
-O2           32-bit Fixed Point C                        5.38
-O2           Single Precision floating point C           9.36
-O2           Double Precision floating point C           9.23
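
The times in Table 8.4 count only the CPU time spent inside the sine function. The following is a minimal sketch of one way such a figure could be isolated in C; it is not the harness used to produce the table. The names time_sine, sample_sine, sine_fn, and BATCH are hypothetical, and sample_sine is only a stand-in polynomial, not one of the four implementations being compared. The sketch assumes that the standard clock() function is precise enough to time one batch of calls.

```c
#include <assert.h>
#include <stdlib.h>
#include <time.h>

#define BATCH 4096   /* hypothetical batch size: inputs generated per timed run */

/* Hypothetical signature for the function under test. */
typedef float (*sine_fn)(float);

/* Stand-in for the function under test: x - x^3/6 + x^5/120.
   Accurate only near zero; used here just to have something to time. */
float sample_sine(float x)
{
    float x2 = x * x;
    return x * (1.0f - x2 / 6.0f * (1.0f - x2 / 20.0f));
}

/* Returns the CPU seconds spent calling fn on batches * BATCH random
   inputs. Random number generation stays outside the timed region. */
double time_sine(sine_fn fn, long batches, float *checksum)
{
    static float in[BATCH];
    clock_t ticks = 0;
    float sum = 0.0f;

    assert(fn != NULL && checksum != NULL);

    for (long b = 0; b < batches; b++) {
        /* Untimed: fill the batch with random angles in [0, 2*pi]. */
        for (int i = 0; i < BATCH; i++)
            in[i] = (float)rand() / (float)RAND_MAX * 6.2831853f;

        /* Timed: only the calls to fn, plus a small loop overhead. */
        clock_t start = clock();
        for (int i = 0; i < BATCH; i++)
            sum += fn(in[i]);
        ticks += clock() - start;
    }

    *checksum = sum;   /* use the results so the calls are not optimized away */
    return (double)ticks / CLOCKS_PER_SEC;
}
```

For example, time_sine(sample_sine, 1000, &sum) would time 4,096,000 calls. Generating inputs in batches keeps random number generation out of the measurement without storing every input at once. The loop overhead is still included, so a profiler may give a cleaner per-function figure.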
8.6.4 Performance comparison
In some situations it can be very advantageous to use fixed point math. Many processors
intended for use in embedded systems do not have a hardware floating point unit.
Table 8.4 shows the CPU time required to compute the sine function on 100,000,000 random
values using each of four implementations. In each case, the program's main() function
was written in C. The only differences among the four implementations were the data type
(fixed point, IEEE single precision, or IEEE double precision) and the sine function
that was used. The times shown in the table include only the CPU time actually used in
the sine function; they do not include the time required for program startup, storage
allocation, random number generation, printing results, or program exit. The four
implementations are as follows: