jumping
=======

There are four different jumping techniques possible with the two
non-conditional jump instructions in the RISC-V Base ISA:

- short jumps: relative (``6f/jal``)
- far jumps: absolute (``37/lui`` + ``67/jalr``)
- far jumps: relative (``17/auipc`` + ``67/jalr``)
- “zero page” jumps: (``67/jalr 0/rs/x0``)

short jumps: relative (6f/jal)
------------------------------

.. CAUTION:: will probably need to explain "0th bit assumed 0" up here already

far jumps: absolute (37/lui + 67/jalr)
--------------------------------------

.. CAUTION:: will need to explain slices up here already

far jumps: relative (17/auipc + 67/jalr)
----------------------------------------

Consider the following scenario, where execution begins at
``main``/``0x13000`` and after performing some other work (two no-ops) we want
to jump to ``target``/``0x02080``::

    == code (0x02080)
    target:
      # we want to jump here
      # infinite loop
      6f/jal 0/rd/x0 target/off20

    == code (0x13000)
    main:
      13/opi 0/subop/add 0/rd/x0 0/rs/x0 0/imm12 # nop
      13/opi 0/subop/add 0/rd/x0 0/rs/x0 0/imm12 # nop
      # now we are at address 0x13008
      # and we want to jump up to "target"

The offset is ``0x02080 - 0x13008`` = ``-0x10f88``. This particular offset
would still be within reach of the short-jump instruction ``6f/jal`` (about
±1 MiB), but the small numbers keep the arithmetic readable; the technique
shown here is what is needed once an offset no longer fits.

Since instructions are 32 bits wide, we cannot fit the whole jump offset, which
is already 32 bits wide on its own, into a single instruction. Instead we have
to split it up into two parts and load it using two instructions::

    MSB                           LSB
    SHHHHHHHHHHHHHHHHHHHH____________  upper 20 bits
    S____________________LLLLLLLLLLL0  sign bit + lower 12 bits (0th bit assumed 0)

The two parts are both signed numbers that, when added together, sum up to the
complete offset we want to jump. In the case given above we can split the
number like so: ``-0x10f88`` = ``(-0x10000) + (-0xf88)``.
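The split described above can be checked with a few lines of Python. This is
only a minimal sketch of the arithmetic as this section states it (both parts
carry the sign of the whole offset); ``split_offset`` is a hypothetical helper
name, not part of SubV, and the sketch does not model instruction encoding.

.. code-block:: python

    def split_offset(offset):
        """Split a signed byte offset into (upper, lower) parts.

        Both parts carry the sign of the original offset, mirroring the
        split used in the text: upper holds everything from bit 12 up,
        lower holds the low 12 bits.
        """
        sign = -1 if offset < 0 else 1
        magnitude = abs(offset)
        upper = sign * (magnitude & ~0xFFF)  # low 12 bits cleared
        lower = sign * (magnitude & 0xFFF)   # low 12 bits only
        return upper, lower


    upper, lower = split_offset(0x02080 - 0x13008)
    assert upper + lower == -0x10f88
    print(hex(upper), hex(lower))  # -0x10000 -0xf88

The assertion confirms that the two parts always add back up to the offset
they were split from.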
The first instruction is ``17/auipc`` (**A**\ dd **U**\ pper **I**\ mmediate
(to) **PC**), which takes the upper 20 bits of our offset and adds them to the
program counter register (pc). By adding to PC, ``17/auipc`` already turns our
relative offset literal into an absolute address. The result of ``17/auipc`` is
stored in a register of our choice; we will use ``t0`` (register ``x5``) here.
We also have to trim off the lower 12 bits of our first part ``-0x10000`` to
get it to fit into the 20-bit immediate argument: ``-0x10000 >> 12`` becomes
``-0x10``::

    17/auipc 5/rd/t0 -0x10/off20

This will store ``pc + (-0x10 << 12)`` = ``0x13008 - 0x10000`` = ``0x3008``
into the register.

At this point the remaining offset is ``0x2080 - 0x3008 = -0xf88`` - exactly
the second part of the offset we have yet to add. We can add this offset to the
register value and jump at once with the ``67/jalr`` instruction, which takes a
register and an immediate value, adds them, and jumps to the resulting address.

We will use the prepared address in our chosen register ``t0`` and our
remaining offset ``-0xf88`` as the immediate value. However, the instruction
only has a 12-bit immediate field, whereas we need to pass our lower 12 bits
*and* a sign bit, a total of 13 bits! Similar to ``6f/jal``, ``67/jalr``
accounts for this by having us drop off the lowest bit of our offset, which is
always going to be 0 anyway, since instructions are always aligned to at least
16 bits. Therefore our immediate becomes ``-0xf88 >> 1 = -0x7c4``::

    67/jalr 0/rd/x0 0/func 5/rs/t0 -0x7c4/off12

.. NOTE:: the ``rd`` output register is used to save the return address when
   calling functions; it is set to ``x0`` here since we are not using it.
   The ``0/func`` particle is a fixed part of the ``67/jalr`` instruction that
   is always ``0``.

.. NOTE:: for convenience, SubV also allows passing a 13-bit offset and will
   trim off the excess lowest bit (after validating that it is in fact zero),
   so we could have also written the following::

      67/jalr 0/rd/x0 0/func 5/rs/t0 -0xf88/off13

This will now add our register value to our literal (shifted back up to the
13-bit range we intended) and jump to the resulting address:
``t0 + (-0x7c4 << 1) = 0x3008 - 0xf88 = 0x2080`` - exactly where we intended to
go :)

label slicing
~~~~~~~~~~~~~

Obviously, redoing all this arithmetic whenever the addresses change is not
something we want to do; we would rather use labels. The SubV syntax affords us
some convenience here, so let's work up from the most to the least explicit
form::

    17/auipc 5/rd/t0 target/off32/[31:12]
    67/jalr 0/rd/x0 0/func 5/rs/t0 target+4/off32/[1:11:1]

In this most explicit variant, both offsets are specified as 32-bit
(``off32``). This is important because SubV will verify that the given offset
fits in the given field width. If we had used ``off12`` for the ``67/jalr``
immediate, the offset value would not fit into that range and an error would
occur. Instead, we use the slice syntax ``[H:L]``/``[S:H:L]`` to specify which
bits we want to extract from each of the 32-bit values to form our shorter
immediates.

The ``H`` and ``L`` indices specify the range of bits to slice out. Both limits
are inclusive (the highest bit taken is ``H`` and the lowest is ``L``), so the
slice ``[15:8]`` has a size of ``1 + H - L = 8`` and includes the bits with
indices ``15, 14, …, 9, 8`` of the original value.

When ``S`` is specified as ``1``, the sign bit (the highest bit) of the
original value is also copied as the highest bit of the resulting value. This
takes up an extra bit in the result value, so the size also increases by one in
this case; ``[1:14:8]`` has a size of 8
(``S + 1 + H - L = 1 + 1 + 14 - 8``) and includes the bits
``31, 14, 13, …, 9, 8`` for a 32-bit input.
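The size and bit-selection rules for ``[H:L]`` and ``[S:H:L]`` can be
illustrated with a short Python sketch. It assumes the input is viewed as a raw
32-bit pattern and returns the raw extracted field; ``slice_bits`` is a
hypothetical name used only for illustration, and how SubV itself represents
negative values is not covered here.

.. code-block:: python

    def slice_bits(value, high, low, sign=0):
        """Extract bits high..low (inclusive) from a 32-bit pattern.

        With sign=1, bit 31 of the input is copied in as the highest bit
        of the result, which grows the result by one bit.
        Returns (field, size_in_bits).
        """
        pattern = value & 0xFFFFFFFF           # view the input as 32 raw bits
        width = 1 + high - low                 # both limits are inclusive
        field = (pattern >> low) & ((1 << width) - 1)
        if sign:
            field |= (pattern >> 31) << width  # bit 31 becomes the new top bit
            width += 1
        return field, width


    # [15:8] of 0x12345678 selects the byte 0x56 and has a size of 8
    print(slice_bits(0x12345678, 15, 8))          # (86, 8)
    # [1:14:8] takes bit 31 plus bits 14..8, again 8 bits in total
    print(slice_bits(0x80001200, 14, 8, sign=1))  # (146, 8)

In the second call the result ``146`` is ``0x92``: the copied sign bit
(``0x80``) sits above the seven extracted bits (``0x12``).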
When ``S`` is not specified, it defaults to ``0`` (no sign bit).

Using this slice syntax, we can slice the labels as required:

- ``target/off32/[31:12]`` is a 20-bit slice
- ``target+4/off32/[1:11:1]`` is a 12-bit slice including the original sign
  bit, the low bits 11 through 1, dropping the lowest bit.

label offsets
~~~~~~~~~~~~~

You may have noticed the ``+4`` literal offset that is added to the label
address. Since our “jump” consists of two instructions, the offset to the
target label is different when calculated relative to the first instruction and
to the second instruction respectively.

However, when building our two offset values (the top 20 and the lower 12
bits), we need to use the offset relative to the ``17/auipc`` instruction both
times. This is because the final address is calculated as ``(hi + pc) + lo``,
where the ``(hi + pc)`` part is calculated by ``17/auipc``, and the
``(…) + lo`` part is done in ``67/jalr``.

Using the example above, if we left off the ``+4`` offset, our two parts would
be::

    # 17/auipc at addr 0x13008
    target/off32/[31:12] = (0x02080 - 0x13008)[31:12]
                         = (-0x10f88)[31:12]
                         = -0x10

    # 67/jalr at addr 0x1300c
    target/off32/[1:11:1] = (0x02080 - 0x1300c)[1:11:1]
                          = (-0x10f8c)[1:11:1]
                          = -0xf8c >> 1

If we walk through the instructions, we now get
``(0x13008 - (0x10 << 12)) - 0xf8c`` as our jump destination, which resolves to
``0x207c`` - four bytes too far! The four bytes extra are the address
difference between the ``17/auipc`` and ``67/jalr`` instructions. By adding
``4`` to the label address in the ``67/jalr`` immediate, we can cancel out this
difference::

    # 67/jalr at addr 0x1300c
    target+4/off32/[1:11:1] = (0x02080 + 4 - 0x1300c)[1:11:1]
                            = (-0x10f88)[1:11:1]
                            = -0xf88 >> 1

.. NOTE:: While this is probably very rare, you could have an offset other
   than ``+4`` if there are other instructions between ``17/auipc`` and
   ``67/jalr``::

      17/auipc 5/rd/t0 target/off32/[31:12]
      13/opi 0/subop/add 0/rd/x0 0/rs/x0 0/imm12 # nop
      13/opi 0/subop/add 0/rd/x0 0/rs/x0 0/imm12 # nop
      67/jalr 0/rd/x0 0/func 5/rs/t0 target+12/off32/[1:11:1]

   Here there are three 32-bit instructions from the start of ``17/auipc`` up
   to ``67/jalr`` (the ``17/auipc`` itself and the two nops), hence a
   ``3 * 4 = 12`` byte offset.

"zero page" jumps: (67/jalr 0/rs/x0)
------------------------------------

.. CAUTION:: Need to test and understand this properly first