Page 379 - ARM 64 Bit Assembly Language

Advanced SIMD instructions 369

                     10.7.6 Reciprocal step

                     These instructions are used to perform one Newton-Raphson step for improving the reciprocal
                     estimates:
                     frecps     Reciprocal Step, and
                     frsqrts    Reciprocal Square Root Step.

                     For each element in the vector, the following equation can be used to improve the estimate of
                     the reciprocal:

                     \[ x_{n+1} = x_n (2 - d x_n), \]

                     where $x_n$ is the estimated reciprocal from the previous step, and $d$ is the number for which
                     the reciprocal is desired. This equation converges to $\frac{1}{d}$ if $x_0$ is obtained using frecpe on $d$.
                     With $d$ and $x_n$ as its source operands, the frecps instruction computes only the factor

                     \[ 2 - d x_n, \]

                     so one additional multiplication (by $x_n$) is required to complete the update step. The initial estimate
                     $x_0$ must be obtained using the frecpe instruction.
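The reciprocal update can be sketched in portable C to make the split between the step instruction and the extra multiplication concrete. This is a scalar model for one vector element, not the instruction itself: `frecps_model` and `refine_reciprocal` are hypothetical names introduced here, the initial estimate `x0` is passed in as a stand-in for the output of the estimate instruction, and the hardware may round differently from this plain C arithmetic.

```c
#include <assert.h>

/* Scalar model of one frecps lane (assumed semantics: 2 - a*b). */
static float frecps_model(float a, float b)
{
    return 2.0f - a * b;
}

/* Refine x0, an initial estimate of 1/d, with `steps` Newton-Raphson steps. */
float refine_reciprocal(float d, float x0, int steps)
{
    assert(steps >= 0);
    float x = x0;
    for (int i = 0; i < steps; i++) {
        float factor = frecps_model(d, x); /* 2 - d*x_n, the step instruction */
        x = x * factor;                    /* the one additional multiplication */
    }
    return x;
}
```

For example, `refine_reciprocal(3.0f, 0.3f, 3)` starts from a rough guess for 1/3 and squares the relative error on each step, so a few steps are enough to reach single precision.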
                     For each element in the vector, the following equation can be used to improve the estimate of
                     the reciprocal of the square root:

                     \[ x_{n+1} = x_n \, \frac{3 - d x_n^2}{2}, \]

                     where $x_n$ is the estimated reciprocal square root from the previous step, and $d$ is the number
                     for which the reciprocal square root is desired. This equation converges to $\frac{1}{\sqrt{d}}$ if $x_0$ is
                     obtained using frsqrte on $d$.
                     The frsqrts instruction multiplies its two source operands $a$ and $b$ and computes

                     \[ \frac{3 - ab}{2}, \]

                     which equals $\frac{3 - d x_n^2}{2}$ when $a = d x_n$ and $b = x_n$,
                     so two additional multiplications are required to complete the update step: one to form $d x_n$
                     and one to multiply the result by $x_n$. The initial estimate
                     $x_0$ must be obtained using the frsqrte instruction.
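The reciprocal square root update can be sketched the same way, showing where the two additional multiplications fall around the step instruction. As before, this is a scalar model for one vector element under stated assumptions: `frsqrts_model` and `refine_rsqrt` are hypothetical names, `x0` stands in for the output of the estimate instruction, and the rounding of the real instruction may differ from plain C arithmetic.

```c
#include <assert.h>

/* Scalar model of one frsqrts lane (assumed semantics: (3 - a*b) / 2). */
static float frsqrts_model(float a, float b)
{
    return (3.0f - a * b) / 2.0f;
}

/* Refine x0, an initial estimate of 1/sqrt(d), with `steps` Newton-Raphson steps. */
float refine_rsqrt(float d, float x0, int steps)
{
    assert(steps >= 0);
    float x = x0;
    for (int i = 0; i < steps; i++) {
        float dx = d * x;                     /* first additional multiplication */
        float factor = frsqrts_model(dx, x);  /* (3 - d*x_n^2) / 2 */
        x = x * factor;                       /* second additional multiplication */
    }
    return x;
}
```

For example, `refine_rsqrt(2.0f, 0.7f, 3)` refines a rough guess for the reciprocal of the square root of 2 (about 0.7071) to roughly single precision.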


                     10.7.6.1 Syntax

                          f<op>        Vd.T, Vn.T, Vm.T
                          f<op>        Fd, Fn, Fm


                     •   <op> is either recps or rsqrts.
                     •   T must be 2s, 4s, or 2d.
                     •   F is s or d.