There are four different jumping techniques possible with the two non-conditional jump instructions in the RISC-V Base ISA:
- short jumps: relative (6f/jal)
- far jumps: absolute (37/lui + 67/jalr)
- far jumps: relative (17/auipc + 67/jalr)
- "zero page" jumps: (67/jalr 0/rs/x0)
short jumps: relative (6f/jal)
@TODO: will probably need to explain "0th bit assumed 0" up here already
far jumps: absolute (37/lui + 67/jalr)
@TODO: will need to explain slices up here already
far jumps: relative (17/auipc + 67/jalr)
Consider the following scenario, where execution begins at main
and, after performing some other work (two nops), we want to jump to target:

    == code (0x02080)
    target:
    # we want to jump here
    # infinite loop
    6f/jal 0/rd/x0 target/off20

    == code (0x13000)
    main:
    13/opi 0/subop/add 0/rd/x0 0/rs/x0 0/imm12 # nop
    13/opi 0/subop/add 0/rd/x0 0/rs/x0 0/imm12 # nop
    # now we are at address 0x13008
    # and we want to jump up to "target"

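The offset is 0x02080 - 0x13008 = -0x10f88.
Strictly speaking, this value still fits the signed 20-bit immediate of the
short-jump instruction 6f/jal (whose lowest offset bit is implied, giving a reach of roughly ±1 MiB),
so treat it as a stand-in for an offset that is out of range; the steps below are the same either way.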
Since instructions are 32 bits wide, we cannot fit the whole jump offset, which is itself 32 bits wide, into a single instruction. Instead, we split it into two parts and load it using two instructions:

    MSB                           LSB
    SHHHHHHHHHHHHHHHHHHHH____________ upper 20 bits
    S____________________LLLLLLLLLLL0 sign bit + lower 12 bits (0th bit assumed 0)

The two parts are both signed numbers that, when added together, sum up to the
complete offset we want to jump. In the case given above we can split the number
-0x10f88 into (-0x10000) + (-0xf88).
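
The split above can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not SubV code: `split_offset` is a hypothetical helper name, and it assumes that both parts carry the sign of the original offset (truncation toward zero), which is what the -0x10000 + -0xf88 split implies.

```python
def split_offset(offset: int) -> tuple[int, int]:
    """Split a byte offset into (hi, lo) so that hi + lo == offset.

    hi keeps everything above the low 12 bits, lo keeps the low 12 bits.
    Assumption: both parts carry the sign of the original offset,
    as in the -0x10000 + -0xf88 example from the text.
    """
    sign = -1 if offset < 0 else 1
    magnitude = abs(offset)
    hi = sign * (magnitude & ~0xFFF)  # low 12 bits cleared
    lo = sign * (magnitude & 0xFFF)   # low 12 bits only
    return hi, lo


offset = 0x02080 - 0x13008
hi, lo = split_offset(offset)
print(hex(offset), hex(hi), hex(lo))  # -0x10f88 -0x10000 -0xf88
```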
The first instruction is
17/auipc (Add Upper Immediate (to) PC),
which takes the upper 20 bits of our offset and adds them to the program counter
register (pc). By adding to PC,
17/auipc already turns our relative offset
literal into an absolute address. The result of
17/auipc is stored in a register
of our choice; we will use t0.
We also have to trim off the lower 12 bits of our first part
-0x10000 to get it
to fit into the 20-bit immediate argument:
-0x10000 >> 12 becomes -0x10:
17/auipc 5/rd/t0 -0x10/off20
This will store
pc + (-0x10 << 12) =
0x13008 - 0x10000 =
0x3008 into the register t0.
At this point the remaining offset is
0x2080 - 0x3008 = -0xf88, exactly the second
part of the offset we have yet to add.
We can add this offset to the register value and jump at once with the 67/jalr
instruction, which takes a register and an immediate value, adds them, and jumps to
the resulting address.
We will use the prepared offset in our chosen register
t0 and our remaining offset
-0xf88 as the immediate value. However, the instruction only has a 12-bit immediate
field, whereas we need to pass our lower 12 bits and a sign bit, a total of 13 bits!
67/jalr accounts for this by having us drop the lowest
bit of our offset, which is always going to be 0 anyway, since instructions have
to be aligned to 16 bits. Therefore our immediate becomes
-0xf88 >> 1 = -0x7c4:
67/jalr 0/rd/x0 0/func 5/rs/t0 -0x7c4/off12
> The rd output register is used to save the return address when calling functions; it is set to x0 here since we are not using it.

> The 0/func particle is a fixed part of the 67/jalr instruction that is always 0.
note: for convenience, SubV also allows passing a 13-bit offset and will trim off the excess lowest bit (after validating that it is in fact zero), so we could have also written the following:
67/jalr 0/rd/x0 0/func 5/rs/t0 -0xf88/off13
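
The off13 convenience described in the note can be sketched as follows. This is a minimal Python illustration of the described behaviour (range check, zero-bit check, then trim), not the actual SubV implementation; the function name `off13_to_off12` is made up for this example.

```python
def off13_to_off12(offset: int) -> int:
    """Turn a 13-bit byte offset into the 12-bit immediate described above.

    Hypothetical helper: checks the signed 13-bit range, validates that the
    lowest bit is zero, then drops that bit.
    """
    if not -(1 << 12) <= offset < (1 << 12):
        raise ValueError(f"{offset:#x} does not fit in a signed 13-bit field")
    if offset & 1:
        raise ValueError(f"{offset:#x} has its lowest bit set")
    return offset >> 1  # exact, because the dropped bit is zero


print(hex(off13_to_off12(-0xf88)))  # -0x7c4
```

An odd offset such as `-0xf87` raises `ValueError` instead of being silently rounded.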
This will now add our register value to our literal (shifted back up to the
13-bit range we intended) and jump to the resulting address:
t0 + (-0x7c4 << 1) = 0x3008 - 0xf88 = 0x2080, exactly where we intended to go :)
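
The whole walk-through can be replayed numerically. The Python sketch below models the two instructions exactly as this document describes them, including the jalr immediate being shifted left by one; the helper names are hypothetical. That shift is an assumption taken from the text: the unprivileged RISC-V specification words JALR differently (the sign-extended 12-bit immediate is added unshifted, then bit 0 of the sum is cleared), so treat this as a model of the document's arithmetic rather than of the hardware.

```python
def auipc(pc: int, imm20: int) -> int:
    """Value written to rd: pc plus the immediate shifted up by 12 bits."""
    return pc + (imm20 << 12)


def jalr_target(rs: int, imm12: int) -> int:
    """Jump target as described in this document: rs + (imm12 << 1).

    Assumption: the left shift follows the text above, not the RISC-V spec.
    """
    return rs + (imm12 << 1)


pc = 0x13008                       # address of the 17/auipc instruction
t0 = auipc(pc, -0x10)              # 17/auipc 5/rd/t0 -0x10/off20
target = jalr_target(t0, -0x7c4)   # 67/jalr ... 5/rs/t0 -0x7c4/off12
print(hex(t0), hex(target))        # 0x3008 0x2080
```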
Obviously, redoing all this arithmetic whenever the addresses change is not something we want to do; we would rather use labels. The SubV syntax affords us some convenience here, so let's work up from the most to the least explicit form:

    17/auipc 5/rd/t0 target/off32/[31:12]
    67/jalr 0/rd/x0 0/func 5/rs/t0 target+4/off32/[1:11:1]

In this most explicit variant, both offsets are specified as 32-bit (off32).
This is important because SubV will verify that the given offset fits in the
given field width. If we had used
off12 for the
67/jalr immediate, the offset
value would not fit into that range and an error would occur.
Instead, we use the slice syntax [S:H:L] to specify which bits we want
to extract from each of the 32-bit values to form our shorter immediates.
- The H and L indices specify the range of bits to slice out. Both limits are inclusive (the highest bit taken is H and the lowest is L), so the slice [15:8] has a size of 1 + H - L = 8 and includes the bits with indices 15, 14, …, 9, 8 of the original value.
- If S is specified as 1, the sign bit (the highest bit) of the original value is also copied as the highest bit of the resulting value. This takes up an extra bit in the result, so the size increases by one in this case: [1:14:8] has a size of 8 (S + 1 + H - L = 1 + 1 + 14 - 8) and includes the bits 31, 14, 13, …, 9, 8 for a 32-bit input.
- If S is not specified, it defaults to 0 (no sign bit).
Using this slice syntax, we can slice the labels as required:
- target/off32/[31:12] is a 20-bit slice
- target+4/off32/[1:11:1] is a 12-bit slice including the original sign bit and the low bits 11 through 1, dropping the lowest bit.
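
The size rules for slices can be checked with a short Python sketch. It only lists which bit indices a slice [S:H:L] selects from a 32-bit value, following the rules stated above; it does not reproduce how SubV extracts the bits from negative values. `slice_indices` is a hypothetical name used for illustration.

```python
def slice_indices(h: int, l: int, s: int = 0, width: int = 32) -> list[int]:
    """Bit indices selected by the slice [S:H:L], highest first."""
    if not 0 <= l <= h < width:
        raise ValueError("expected 0 <= L <= H < width")
    indices = list(range(h, l - 1, -1))  # H down to L, both inclusive
    if s:
        indices.insert(0, width - 1)     # sign bit of the original value
    return indices


for s, h, l in [(0, 15, 8), (1, 14, 8), (0, 31, 12), (1, 11, 1)]:
    bits = slice_indices(h, l, s)
    print(f"[{s}:{h}:{l}] size={len(bits)} bits={bits}")
```

The sizes come out as 8, 8, 20 and 12, matching S + 1 + H - L for each slice.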
You may have noticed the
+4 literal offset that is added to the label address.
Since our "jump" consists of two instructions, the offset to the target label is
different when calculated relative to the first instruction and to the second.
However, when building our two offset values (the top 20 and the lower 12 bits),
we need to use the offset relative to the
17/auipc instruction both times.
This is because the final address is calculated as
(hi + pc) + lo, where the
(hi + pc) part is calculated by
17/auipc, and the
(…) + lo part is done by 67/jalr.
Using the example above, if we left off the
+4 offset, our two parts would be:

    # 17/auipc at addr 0x13008
    target/off32/[31:12] = (0x02080 - 0x13008)[31:12] = (-0x10f88)[31:12] = -0x10
    # 67/jalr at addr 0x1300c
    target/off32/[1:11:1] = (0x02080 - 0x1300c)[1:11:1] = (-0x10f8c)[1:11:1] = -0xf8c >> 1

If we walk through the instructions, we now get
(0x13008 - (0x10 << 12)) - 0xf8c
as our jump destination, which resolves to
0x207c: we have jumped backwards four bytes too far!
The four bytes extra are the address difference between the 17/auipc and 67/jalr
instructions. By adding
4 to the label address in the
67/jalr immediate, we can
cancel out this difference:

    # 67/jalr at addr 0x1300c
    target+4/off32/[1:11:1] = (0x02080 + 4 - 0x1300c)[1:11:1] = (-0x10f88)[1:11:1] = -0xf88 >> 1

NOTE: While this is probably very rare, you could have an offset other than
+4 if there are other instructions between 17/auipc and 67/jalr:

    17/auipc 5/rd/t0 target/off32/[31:12]
    13/opi 0/subop/add 0/rd/x0 0/rs/x0 0/imm12 # nop
    13/opi 0/subop/add 0/rd/x0 0/rs/x0 0/imm12 # nop
    67/jalr 0/rd/x0 0/func 5/rs/t0 target+12/off32/[1:11:1]

Here there are three 32-bit instructions between the start of 17/auipc and the start of
67/jalr, hence a
3 * 4 = 12 byte offset.
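
The correction term can be checked numerically. The Python sketch below is an illustration under the same assumptions as the walk-through (both offset parts carry the sign of the full offset; helper names are hypothetical). It shows that the jump lands on the label when the literal added to the label equals the address difference between 67/jalr and 17/auipc.

```python
def split_offset(offset: int) -> tuple[int, int]:
    """(hi, lo) parts of an offset; both carry the offset's sign (assumption)."""
    sign = -1 if offset < 0 else 1
    magnitude = abs(offset)
    return sign * (magnitude & ~0xFFF), sign * (magnitude & 0xFFF)


def far_jump_destination(target: int, auipc_addr: int,
                         jalr_addr: int, correction: int) -> int:
    """Address reached when the jalr immediate is built from target+correction."""
    hi, _ = split_offset(target - auipc_addr)              # target/off32/[31:12]
    _, lo = split_offset(target + correction - jalr_addr)  # target+N/off32/[1:11:1]
    return auipc_addr + hi + lo


target, auipc_addr = 0x02080, 0x13008
for jalr_addr in (0x1300c, 0x13014):      # directly after auipc / after two nops
    correction = jalr_addr - auipc_addr   # 4 and 12 respectively
    dest = far_jump_destination(target, auipc_addr, jalr_addr, correction)
    print(f"+{correction}: lands on target? {dest == target}")
```

Passing `correction=0` for the first case yields a destination that differs from `target`, which is the mismatch the `+4` literal removes.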
"zero page" jumps: (67/jalr 0/rs/x0)
@TODO: Need to test and understand this properly first