Page 222 - ARM 64 Bit Assembly Language

210 Chapter 7


                             Listing 7.5 AArch64 assembly code for division by constant 193.

                // The following code calculates w2/193.
                // It leaves the quotient in x0 and
                // the remainder in x1.
                ldr     x3, =0x54E42524        // x3 = m = (1/193 << 38)
                smull   x4, w2, w3             // x4 = x * m (64-bit signed product)
                asr     x0, x4, #38            // x0 = product >> 38
                sub     x0, x0, x4, asr #63    // subtract sign of dividend (adds 1 if negative)
                // calculate remainder in x1
                mov     x1, #193               // x1 = divisor
                mul     x1, x1, x0             // multiply divisor by quotient
                sub     x1, x2, x1             // subtract that from numerator
                                               // (assumes x2 holds w2 sign-extended)



                  Example 17 shows how to calculate m and n for division by 193. On the AArch64 processor,
                  division by a constant can be performed very efficiently. Listing 7.5 shows how division by
                  193 can be implemented using only a few lines of code. In the listing, the numbers are 32 bits
                  in length, so the constant m is much larger than in the example that was multiplied by hand,
                  but otherwise the method is the same.
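
                  The instruction sequence in Listing 7.5 can be modeled in C as a quick sanity
                  check. This is a minimal sketch, not code from the text: the function names
                  div193 and rem193 are hypothetical, and the sketch assumes that >> applied to a
                  negative int64_t is an arithmetic shift (true for GCC and Clang, but
                  implementation-defined in C).

```c
#include <stdint.h>

/* Hypothetical C model of Listing 7.5 (names are illustrative only). */
int32_t div193(int32_t x)
{
    const int64_t m = 0x54E42524;     /* multiplier loaded into x3 */
    int64_t prod = (int64_t)x * m;    /* smull x4, w2, w3 */
    int64_t q = prod >> 38;           /* asr x0, x4, #38 (assumes arithmetic shift) */
    q -= prod >> 63;                  /* sub x0, x0, x4, asr #63: adds 1 if prod < 0 */
    return (int32_t)q;
}

int32_t rem193(int32_t x)
{
    /* numerator minus divisor times quotient, as in the last three instructions */
    return (int32_t)(x - INT64_C(193) * div193(x));
}
```

                  For example, div193(1000) returns 5 and rem193(1000) returns 35, matching
                  1000 = 5 x 193 + 35. For -1000 the results are -5 and -35, because the sign
                  correction rounds the quotient toward zero.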
                  If we wish to divide by 23 using 32 bits of precision, we compute the multiplier as

                  \[
                    m = \frac{2^{32+4-1}}{23} + 1 = \frac{2^{35}}{23} + 1 = 1493901669.17 \approx 1493901669 = \mathrm{590B2165}_{16}.
                  \]
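
                  The arithmetic above can be reproduced with integer division. The following is
                  a small sketch under stated assumptions: the helper name const_div_multiplier
                  and its parameters are hypothetical, and the fractional part is truncated, as
                  it is in the text.

```c
#include <stdint.h>

/*
 * Hypothetical helper: evaluates floor(2^shift / c) + 1, the multiplier
 * expression used in the text. Assumes c != 0 and shift < 64.
 */
uint64_t const_div_multiplier(uint32_t c, unsigned shift)
{
    return (UINT64_C(1) << shift) / c + 1;
}
```

                  With c = 23 and shift = 35 this yields 0x590B2165, and with c = 193 and
                  shift = 38 it yields 0x54E42524, the constant loaded in Listing 7.5.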
                  That is $01011001000010110010000101100101_2$. Note that there are only 13 non-zero bits,
                  and the pattern 1011001 appears three times in the 32-bit multiplier. The multiply can be
                  implemented as
                  \[
                    2^{24}(2^6 x + 2^4 x + 2^3 x + 2^0 x) + 2^{13}(2^6 x + 2^4 x + 2^3 x + 2^0 x) + 2^{2}(2^6 x + 2^4 x + 2^3 x + 2^0 x) + 2^0 x.
                  \]
                  So the following code sequence can be used on processors that do not have
                  the multiply instruction:

        Listing 7.6 AArch64 assembly code for division of a variable by a constant without using a multiply instruction.

                // The following code will calculate w2/23
                // It will leave the quotient in x0 and does
                // not calculate a remainder
                sxtw    x2, w2         // Sign extend w2
                mov     x0, x2         // Copy into x0
                // calculate 2^6x+2^4x+2^3x+2^0x
                add     x3, x2, x0, lsl #3
                add     x3, x3, x0, lsl #4
                add     x3, x3, x0, lsl #6
                // now perform three 64-bit shift-add operations
                lsl     x3, x3, #2