Page 158 - ARM 64 Bit Assembly Language


Listing 5.36 Initializing a structured data type in C.

    #include <string.h>

    struct student {
        char first_name[30];
        char last_name[30];
        unsigned char class;
        int grade;
    };

    int main(void)
    {
        struct student newstudent;  /* allocate struct on the stack */
        strcpy(newstudent.first_name, "Sam");
        strcpy(newstudent.last_name, "Smith");
        newstudent.class = 2;
        newstudent.grade = 88;
        /* ... */
        return 0;
    }




three values is represented using an unsigned eight bit integer. Image processing software
often adds a fourth value, α, specifying the transparency of each pixel.

Listing 5.38 shows how an array of pixels can be allocated and initialized in C. The listing
uses the malloc() function from the C standard library to allocate storage for the pixels
from the heap (see Section 1.4). Note that the code uses the sizeof operator to determine
how many bytes of memory are consumed by a single pixel, then multiplies that by the
width and height of the image. Listing 5.39 shows the equivalent code in AArch64 assembly.
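The allocation described above can be sketched roughly as follows. This is not a reproduction of Listing 5.38: the names struct pixel and alloc_image are hypothetical, and the sketch assumes a pixel holds exactly three unsigned eight bit components. The index calculation row * width + col is the kind of multiply and addition that a nested-loop version has to perform for every pixel. The size calculation is not checked for overflow.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical pixel type: one unsigned 8-bit value per color component. */
struct pixel {
    unsigned char red;
    unsigned char green;
    unsigned char blue;
};

/*
 * Allocate a width x height image from the heap and set every
 * component to zero using nested row/column loops.
 * Returns NULL if malloc() fails; the caller must free() the result.
 */
struct pixel *alloc_image(size_t width, size_t height)
{
    assert(width > 0 && height > 0);

    /* sizeof gives the bytes used by one pixel; scale by the pixel count. */
    struct pixel *image = malloc(width * height * sizeof(struct pixel));
    if (image == NULL)
        return NULL;

    for (size_t row = 0; row < height; row++) {
        for (size_t col = 0; col < width; col++) {
            /* One multiply and one add to locate each pixel. */
            struct pixel *p = &image[row * width + col];
            p->red = 0;
            p->green = 0;
            p->blue = 0;
        }
    }
    return image;
}
```

A caller would use it as struct pixel *img = alloc_image(640, 480); and release the storage with free(img) when finished.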

Note that the code in Listing 5.39 is far from optimal. It can be greatly improved by combining
the two loops into one loop. This removes the need for the multiply on line 31 and the
addition that follows it, and it simplifies the code structure. An additional improvement is to
increment the single loop counter by three on each iteration, so that the counter is also the
byte offset of the current pixel and the pointer to each pixel is easy to calculate.
Listing 5.40 shows the AArch64 assembly implementation with these optimizations.
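To make the single-loop idea concrete, here is a rough C sketch of the same transformation. It is not the book's Listing 5.40, which is written in AArch64 assembly. The names alloc_image_flat and BYTES_PER_PIXEL are hypothetical, and the sketch assumes three bytes per pixel stored contiguously. Because the counter advances by three, it doubles as the byte offset of the current pixel, so no per-pixel multiply is needed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Assumed layout: three one-byte color components per pixel. */
enum { BYTES_PER_PIXEL = 3 };

/*
 * Allocate a width x height image as a flat byte array and zero it
 * with a single loop whose counter steps by one pixel (three bytes).
 * Returns NULL if malloc() fails; the caller must free() the result.
 */
unsigned char *alloc_image_flat(size_t width, size_t height)
{
    assert(width > 0 && height > 0);

    size_t nbytes = width * height * BYTES_PER_PIXEL;
    unsigned char *image = malloc(nbytes);
    if (image == NULL)
        return NULL;

    /* i is the byte offset of the current pixel. */
    for (size_t i = 0; i < nbytes; i += BYTES_PER_PIXEL) {
        unsigned char *p = image + i;   /* pointer to this pixel */
        p[0] = 0;                       /* three separate byte stores */
        p[1] = 0;
        p[2] = 0;
    }
    return image;
}
```

The loop body still issues three separate byte stores per pixel, which leaves room for further improvement.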

Although the implementation shown in Listing 5.40 is more efficient than the previous version,
there are several more improvements that can be made. If we consider that the goal of
the code is to allocate some number of bytes and initialize them all to zero, then the code can
be written more efficiently. Rather than using three separate store instructions to set 3 bytes