Page 332 - ARM 64 Bit Assembly Language

Floating point 321


                              {
                                int i,j;
                                point result;
                                for(i=0;i<3;i++)
                                  result[i] = m[3][i];
                                for(i=0;i<3;i++)
                                {
                                  for(j=0;j<3;j++)
                                    result[i] += m[j][i] * p[j];
                                }
                                *p = result;
                              }
                           Write optimal AArch64 FP/NEON code to implement this function.
                     9.10. The function in the previous problem would typically be called multiple times to
                           process an array of points, as in the following function:

                              void xformall(matrix *m, point *p, int num_points)
                              {
                                int i;
                                for(i=0;i<num_points;i++)
                                  xform(m,p+i);
                              }
                           Calling xform once per point adds function-call overhead to every iteration of
                           the loop. Rewrite this function in assembly so that each point is transformed
                           inside the loop, without a function call. Make your code as efficient as possible.
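
                           As a starting point, the loop with the transformation folded into it might
                           look like the C sketch below. The type layouts are assumptions, since the
                           definitions of matrix and point are not shown here: point_t is taken to hold
                           three floats and matrix_t four rows of three floats, with row 3 holding the
                           translation. The names point_t, matrix_t, and xformall_inline are hypothetical.
                           This is only a reference for checking results, not the assembly solution.

```c
/* Assumed layouts; the excerpt does not show the real typedefs. */
typedef struct { float v[3]; } point_t;      /* hypothetical: x, y, z */
typedef struct { float a[4][3]; } matrix_t;  /* hypothetical: row 3 = translation */

/* Transform every point in place, with the body of xform inlined. */
void xformall_inline(const matrix_t *m, point_t *p, int num_points)
{
    for (int n = 0; n < num_points; n++) {
        float r[3];

        for (int i = 0; i < 3; i++) {
            /* Start from the translation row, as xform does. */
            r[i] = m->a[3][i];
            /* Accumulate column i of the 3x3 part times the point. */
            for (int j = 0; j < 3; j++)
                r[i] += m->a[j][i] * p[n].v[j];
        }

        /* Write back only after all three components are computed,
           because the inputs p[n].v[] are read for every component. */
        for (int i = 0; i < 3; i++)
            p[n].v[i] = r[i];
    }
}
```

                           Because the matrix does not change between points, an assembly version can
                           load its twelve elements into registers once, before the loop, and reuse
                           them for every point.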