Page 332 - ARM 64 Bit Assembly Language
{
  int i,j;
  point result;
  for(i=0;i<3;i++)
    result[i] = m[3][i];
  for(i=0;i<3;i++)
    {
      for(j=0;j<3;j++)
        result[i] += m[j][i] * p[j];
    }
  *p = result;
}
Write optimal AArch64 FP/NEON code to implement this function.
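Before writing the assembly, it can help to restate the computation row by row. The following is a minimal C sketch, not the book's solution. It assumes that point is an array of three floats and matrix is a 4x4 array of floats (the actual typedefs are not shown in this part of the text), and the name xform_rows is hypothetical. It computes the same result as the loops above, but orders the work as row 3 plus p[0] times row 0, plus p[1] times row 1, plus p[2] times row 2, so that each step is one vector multiplied by a scalar and accumulated.

```c
/* Assumed typedefs: the exercise's real definitions are not shown here. */
typedef float point[3];
typedef float matrix[4][4];

/* Hypothetical reference version of xform, written row by row:
   result = row3 + p[0]*row0 + p[1]*row1 + p[2]*row2.
   Each pass of the inner loop is one "vector times scalar, accumulate". */
void xform_rows(matrix *m, point *p)
{
    float result[3];
    int i, j;

    /* Start from the translation row. */
    for (i = 0; i < 3; i++)
        result[i] = (*m)[3][i];

    /* Accumulate one scaled matrix row per point component. */
    for (j = 0; j < 3; j++) {
        float s = (*p)[j];
        for (i = 0; i < 3; i++)
            result[i] += (*m)[j][i] * s;
    }

    /* Write the transformed point back in place. */
    for (i = 0; i < 3; i++)
        (*p)[i] = result[i];
}
```

A version like this can also serve as a reference for checking the output of the assembly implementation on a few known points.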
9.10. The function in the previous problem would typically be called multiple times to
process an array of points, as in the following function:
void xformall(matrix *m, point* p, int num_points)
{
  int i;
  for(i=0;i<num_points;i++)
    xform(m,p+i);
}
This can be inefficient, because every point incurs the overhead of a function call.
Rewrite this function in assembly so that each point is transformed without a
function call. Make your code as efficient as possible.
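One way to picture the inlined loop is sketched below in C. This is an illustration under stated assumptions, not the book's solution: the typedefs for point and matrix are assumed (three floats, and a 4x4 array of floats), and the name xformall_inline is hypothetical. The sketch copies the matrix entries that are actually used into local storage once, before the loop, so the loop body only reads and writes the point. In assembly, the corresponding idea would be to keep the matrix in registers for the whole loop.

```c
/* Assumed typedefs: the exercise's real definitions are not shown here. */
typedef float point[3];
typedef float matrix[4][4];

/* Hypothetical inlined version of xformall: no call per point, and the
   matrix is read from memory only once. */
void xformall_inline(matrix *m, point *p, int num_points)
{
    float r[4][3];
    int i, j, n;

    /* Copy the 4x3 part of the matrix that the transform uses. */
    for (j = 0; j < 4; j++)
        for (i = 0; i < 3; i++)
            r[j][i] = (*m)[j][i];

    for (n = 0; n < num_points; n++) {
        /* Read the point first, so it can be overwritten in place. */
        float x = p[n][0], y = p[n][1], z = p[n][2];

        /* Same accumulation order as the original loops. */
        for (i = 0; i < 3; i++)
            p[n][i] = ((r[3][i] + r[0][i] * x) + r[1][i] * y) + r[2][i] * z;
    }
}
```

Reading x, y, and z before the inner loop matters here: the point is updated in place, and every output component depends on all three input components.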