Page 342 - ARM 64 Bit Assembly Language


                  •  <n> must be one of 1, 2, 3, or 4.
                  •  <list> specifies the list of registers. There are four list formats:
                     1. {Vt.T}
                     2. {Vt.T, V(t+1).T} or {Vt.T-V(t+1).T}
                     3. {Vt.T, V(t+1).T, V(t+2).T} or {Vt.T-V(t+2).T}
                     4. {Vt.T, V(t+1).T, V(t+2).T, V(t+3).T} or {Vt.T-V(t+3).T}
                     The registers must be consecutive. Register 0 is consecutive to register 31.
                  •  T must be 16b, 8b, 8h, 4h, 4s, 2s, or 2d. If <n> is 1, then T can be 1d.
                  •  Xn is the AArch64 register containing the base address.
                  •  Xm is the AArch64 register containing an offset.
                  •  If a register or immediate offset is given, then the base register, Xn, will be post-
                     incremented.
                  •  The post-increment immediate offset, if present, must be 1, 2, 3, 4, 6, 8, 12, 16, 24, or 32,
                     depending on the number of elements transferred and the size specified by T.
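
                  The rule in the last bullet can be sketched in C. This is a minimal
                  illustration, not part of the instruction set documentation: it assumes
                  the immediate equals the number of bytes read, that is, <n> elements of
                  the size implied by T. The function names elem_bytes and
                  ldnr_post_increment are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Element size in bytes for the letter in an arrangement specifier:
   'b' = 1, 'h' = 2, 's' = 4, 'd' = 8. Returns 0 for anything else. */
size_t elem_bytes(char suffix)
{
    switch (suffix) {
    case 'b': return 1;
    case 'h': return 2;
    case 's': return 4;
    case 'd': return 8;
    default:  return 0;
    }
}

/* Assumed rule: ld<n>r reads n elements, so the immediate
   post-increment is n times the element size. */
size_t ldnr_post_increment(unsigned n, char suffix)
{
    assert(n >= 1 && n <= 4);
    return n * elem_bytes(suffix);
}
```

                  Under this assumption, every combination of <n> and element size gives
                  one of the ten values listed above: for example, ld3r with byte elements
                  gives 3, and ld4r with doubleword elements gives 32.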

                  10.2.3.2 Operations

                   Name      Effect                               Description
                   ld<n>r    tmp ← Xn                             Load one structure into all lanes of
                             incr ← byteSize(T)                   one or more registers.
                             for V ∈ regs(<list>) do
                               for 0 ≤ x < nLanes(T) do
                                 V[x] ← Mem[tmp]
                               end for
                               tmp ← tmp + incr
                             end for
                             if #imm is present then
                               Xn ← Xn + imm
                             else
                               if Xm is specified then
                                 Xn ← Xn + Xm
                               end if
                             end if
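
                  The pseudocode above can be modeled in portable C to see which bytes end
                  up in which lanes. This is a sketch under stated assumptions, not how the
                  hardware is implemented: each register is simulated as a 16-byte array,
                  and the name ldnr_model and its parameters are hypothetical. The returned
                  pointer corresponds to the value Xn would hold after an immediate
                  post-increment equal to the number of bytes read.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define MAX_VEC_BYTES 16   /* one full 128-bit vector register */

/* Hypothetical model of ld<n>r.
     mem       : the address held in Xn
     n         : number of registers in the list (1 to 4)
     elem_size : bytes per element (1, 2, 4, or 8)
     lanes     : lanes per register (for example, 8 for the 8b arrangement)
     regs      : n simulated registers, written by this function
   Returns the address just past the last element read. */
const uint8_t *ldnr_model(const uint8_t *mem, unsigned n,
                          unsigned elem_size, unsigned lanes,
                          uint8_t regs[][MAX_VEC_BYTES])
{
    assert(n >= 1 && n <= 4);
    assert(elem_size * lanes <= MAX_VEC_BYTES);

    const uint8_t *tmp = mem;
    for (unsigned v = 0; v < n; v++) {
        /* A 64-bit arrangement leaves the upper half of the register zero. */
        memset(regs[v], 0, MAX_VEC_BYTES);
        /* Copy the same element into every lane of this register. */
        for (unsigned x = 0; x < lanes; x++)
            memcpy(&regs[v][x * elem_size], tmp, elem_size);
        /* Advance to the next element of the structure. */
        tmp += elem_size;
    }
    return tmp;
}
```

                  Calling ldnr_model with n = 3, elem_size = 1, and lanes = 8 on a
                  three-byte structure fills the first simulated register with eight copies
                  of the first byte, the second with eight copies of the second byte, and
                  the third with eight copies of the third byte.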

                  10.2.3.3 Examples

                      // Load eight copies of an rgb struct into
                      // v0 (red), v1 (green), and v2 (blue)
                      ld3r    {v0.8b-v2.8b},[x0]  // load 8 copies