Page 292 - ARM 64 Bit Assembly Language

Non-integral mathematics 281


                   bl     fixed_cos
                   mov    x1,#28
                   ldr    x2,=fmta
                   bl     printS

                   ldr    x0,=newline
                   bl     printf

                   add    x27,x27,#1          // increment the loop counter
                   cmp    x27,#33
                   blt    mloop               // repeat while x27 < 33

                   ldp    x29,x30,[sp],#16    // restore frame pointer and link register
                   ldp    x27,x28,[sp],#16    // restore callee-saved x27 and x28
                   ret                        // return to caller

                   //----------------------------------------------------------------

                     Table 8.4: Performance of sine function with various implementations.

                     Optimization  Implementation                      CPU seconds
                     ------------  ----------------------------------  -----------
                     None          32-bit Fixed Point Assembly                5.15
                                   32-bit Fixed Point C                      18.60
                                   Single Precision floating point C         10.56
                                   Double Precision floating point C         10.07
                     -O2           32-bit Fixed Point Assembly                4.73
                                   32-bit Fixed Point C                       5.38
                                   Single Precision floating point C          9.36
                                   Double Precision floating point C          9.23



                     8.6.4 Performance comparison

                     In some situations, using fixed point math can be very advantageous. Many processors
                     intended for use in embedded systems have no hardware floating point unit.
                     Table 8.4 shows the CPU time required to compute the sine function on 100,000,000
                     random values, using various implementations of the sine function. In each case,
                     the program's main() function was written in C. The only differences among the four
                     implementations were the data type (fixed point, IEEE single precision, or IEEE
                     double precision) and the sine function that was used. The times shown in the table
                     include only the CPU time actually spent in the sine function; they do not include
                     the time required for program startup, storage allocation, random number generation,
                     printing results, or program exit. The four implementations are as follows: