Page 191 - ARM 64 Bit Assembly Language



                          bgt     finale
                          ldr     x0, [x19]              // (*left)->count > pivot
                          ldr     w1, [x0, #wln_count]
                          cmp     x1, x12
                          ble     loopb
                          add     x19, x19, #8           // increment left
                          b       loopa
                  loopb:  cmp     x19, x20               // while left <= right &&
                          bgt     finale
                          ldr     x2, [x20]              // (*right)->count < pivot
                          ldr     w3, [x2, #wln_count]
                          cmp     x3, x12
                          bge     cmp
                          sub     x20, x20, #8           // decrement right
                          b       loopb
                  cmp:    cmp     x19, x20               // if( left <= right )
                          bgt     finale
                          str     x0, [x20], #-8         // swap pointers and
                          str     x2, [x19], #8          // change indices
                          b       loopa
                  finale: mov     x0, x21                // quicksort array from
                          mov     x1, x20                // first to current right
                          bl      wl_quicksort
                          mov     x0, x19                // quicksort array from
                          mov     x1, x22                // current left to last
                          bl      wl_quicksort
                          ldp     x19, x20, [sp, #16]
                          ldp     x21, x22, [sp, #32]
                          ldp     x29, x30, [sp], #48
                  wl_quicksort_exit:
                          ret
                          .size   wl_quicksort,(. - wl_quicksort)

                  The tree-based implementation gets most of its speed improvement by using two
                  $O(N \log N)$ algorithms in place of $O(N^2)$ algorithms. These examples show how a small
                  part of a program can be implemented in assembly language, and how to access C data
                  structures from assembly language. The functions could just as easily have been written in
                  C rather than assembly, without greatly affecting performance. Later chapters will show
                  examples where the assembly implementation does have significantly better performance
                  than the C implementation.
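
For comparison, the partition-and-recurse logic of the listing above can be sketched in C. This is a rough equivalent under stated assumptions, not the book's own C source. The names `wl_node`, `sort_range`, and `wl_sort_by_count` are hypothetical, and the node layout is reduced to the two fields the sketch needs. The pivot choice (the middle element) is an assumption, since the excerpt does not show how the pivot is selected. The descending order is inferred from the comparisons in the assembly, which advance `left` while the count is greater than the pivot.

```c
#include <stddef.h>

/* Hypothetical node layout: only the count field matters for sorting. */
struct wl_node {
    const char *word;
    int count;
};

/* Sort a[lo..hi] (inclusive) into descending order of count. */
static void sort_range(struct wl_node **a, ptrdiff_t lo, ptrdiff_t hi)
{
    if (lo >= hi)
        return;

    ptrdiff_t left = lo;
    ptrdiff_t right = hi;
    /* Assumed pivot choice: the count of the middle element. */
    int pivot = a[lo + (hi - lo) / 2]->count;

    while (left <= right) {
        while (a[left]->count > pivot)    /* skip items already on the high side */
            left++;
        while (a[right]->count < pivot)   /* skip items already on the low side */
            right--;
        if (left <= right) {
            /* Swap the pointers, then move both indices inward. */
            struct wl_node *tmp = a[left];
            a[left] = a[right];
            a[right] = tmp;
            left++;
            right--;
        }
    }
    sort_range(a, lo, right);   /* first to current right */
    sort_range(a, left, hi);    /* current left to last */
}

/* Convenience wrapper: sort n node pointers by descending count. */
void wl_sort_by_count(struct wl_node **a, size_t n)
{
    if (n > 1)
        sort_range(a, 0, (ptrdiff_t)n - 1);
}
```

Only the pointers are swapped, never the nodes themselves, which mirrors the two post-indexed `str` instructions in the assembly. Calling `wl_sort_by_count(ptrs, n)` on an array of `n` node pointers leaves the nodes in place and reorders the pointer array.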


                  6.3 Ethics case study: Therac-25

                  The Therac-25 was a device designed for radiation treatment of cancer. It was produced by
                  Atomic Energy of Canada Limited (AECL), which had previously produced the Therac-6 and