git.s-ol.nu subv / 52a9033
add JUMPING.md s-ol 28 days ago
1 changed file(s) with 180 addition(s) and 0 deletion(s).
0 # jumping
1 Four different jumping techniques are possible with
2 the two unconditional jump instructions in the RISC-V Base ISA:
3
4 - short jumps: relative (`6f/jal`)
5 - far jumps: absolute (`37/lui` + `67/jalr`)
6 - far jumps: relative (`17/auipc` + `67/jalr`)
7 - "zero page" jumps: (`67/jalr 0/rs/x0`)
8
9 ## short jumps: relative (6f/jal)
10
11 @TODO: will probably need to explain "0th bit assumed 0" up here already
12
13 ## far jumps: absolute (37/lui + 67/jalr)
14
15 @TODO: will need to explain slices up here already
16
17 ## far jumps: relative (17/auipc + 67/jalr)
18 Consider the following scenario, where execution begins at `main`/`0x13000`
19 and, after performing some other work (two no-ops), we want to jump to
20 `target`/`0x02080`:
21
22 == code (0x02080)
23 target: # we want to jump here
24 # infinite loop
25 6f/jal 0/rd/x0 target/off20
26
27 == code (0x13000)
28 main:
29 13/opi 0/subop/add 0/rd/x0 0/rs/x0 0/imm12 # nop
30 13/opi 0/subop/add 0/rd/x0 0/rs/x0 0/imm12 # nop
31
32 # now we are at address 0x13008
33 # and we want to jump up to "target"
34
35 The offset is `0x02080 - 0x13008` = `-0x10f88`. This would in fact still be in reach of the short-jump
36 instruction `6f/jal`, but we treat it as a far jump here, since the technique is the same for any 32-bit offset.
37
38 Since instructions are 32 bits wide, we cannot fit the whole jump offset, which
39 is already 32 bits wide on its own, into one instruction. Instead we have to split
40 it up into two parts, and load it using two instructions:
41
42 MSB LSB
43 SHHHHHHHHHHHHHHHHHHHH____________ upper 20 bits
44 S____________________LLLLLLLLLLL0 sign bit + lower 12 bits (0th bit assumed 0)
45
46 The two parts are both signed numbers that when added together sum up to the
47 complete offset we want to jump. In the case given above we can split the number
48 like so: `-0x10f88` = `(-0x10000) + (-0xf88)`.
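
The split can be checked with a few lines of Python. This is only a sketch of the arithmetic as this text describes it; the helper name `split_offset` is made up for illustration and is not part of SubV.

```python
def split_offset(offset):
    """Split a byte offset into (upper, lower) parts as described in the text.

    Both parts carry the sign of the original offset; the lower part holds
    the low 12 bits of its magnitude, the upper part everything above.
    """
    sign = -1 if offset < 0 else 1
    magnitude = abs(offset)
    upper = sign * (magnitude & ~0xFFF)  # bits 12 and up
    lower = sign * (magnitude & 0xFFF)   # bits 11 down to 0
    return upper, lower


offset = 0x02080 - 0x13008
upper, lower = split_offset(offset)
print(hex(offset), hex(upper), hex(lower))  # -0x10f88 -0x10000 -0xf88
```

The two parts always add back up to the original offset, which is the property the two-instruction jump relies on.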
49
50 The first instruction is `17/auipc` (**A**dd **U**pper **I**mmediate (to) **PC**),
51 which takes the upper 20 bits of our offset and adds them to the program counter
52 register (pc). By adding to PC, `17/auipc` already turns our relative offset
53 literal into an absolute address. The result of `17/auipc` is stored in a register
54 of our choice; we will use `t0` (register `x5`) here.
55
56 We also have to trim off the lower 12 bits of our first part `-0x10000` to get it
57 to fit into the 20-bit immediate argument: `-0x10000 >> 12` becomes `-0x10`.
58
59 17/auipc 5/rd/t0 -0x10/off20
60
61 This will store `pc + (-0x10 << 12)` = `0x13008 - 0x10000` = `0x3008` into the register.
62 At this point the remaining offset is `0x2080 - 0x3008 = -0xf88` - exactly the second
63 part of the offset we have yet to add.
64
65 We can add this offset to the register value and jump at once with the `67/jalr`
66 instruction, which takes a register and an immediate value, adds them, and jumps to
67 the resulting address.
68
69 We will use the prepared address in our chosen register `t0` and our remaining offset
70 `-0xf88` as the immediate value. However, the instruction only has a 12-bit immediate
71 field, whereas we need to pass our lower 12 bits *and* a sign bit, a total of 13 bits!
72 Similar to `6f/jal`, `67/jalr` accounts for this by having us drop the lowest
73 bit of our offset, which is always going to be 0 anyway, since instructions have
74 to be aligned to 16 bits. Therefore our immediate becomes `-0xf88 >> 1 = -0x7c4`:
75
76 67/jalr 0/rd/x0 0/func 5/rs/t0 -0x7c4/off12
77
78 > **note:** the `rd` output register is used to save the return address when calling
79 > functions; it is set to `x0` here since we are not using it. The `0/func`
80 > particle is a fixed part of the `67/jalr` instruction that is always `0`.
81
82
83 > **note:** for convenience, SubV also allows passing a 13-bit offset and will trim
84 > off the excess lowest bit (after validating that it is in fact zero), so we could
85 > have also written the following:
86 >
87 > 67/jalr 0/rd/x0 0/func 5/rs/t0 -0xf88/off13
88
89 This will now add our register value to our literal (shifted back up to the
90 13-bit range we intended) and jump to the resulting address:
91 `t0 + (-0x7c4 << 1) = 0x3008 - 0xf88 = 0x2080` - exactly where we intended to go :)
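
To double-check the walk-through, here is a minimal Python sketch that models the two instructions exactly as this text describes them. It is not a RISC-V emulator, and the function names `auipc` and `jalr_target` are hypothetical helpers chosen for this illustration.

```python
def auipc(pc, imm20):
    # As described above: shift the 20-bit immediate up by 12 bits
    # and add it to the program counter.
    return pc + (imm20 << 12)


def jalr_target(base, imm12):
    # As described in this text: the immediate is shifted back up by
    # one bit before being added to the register value.
    return base + (imm12 << 1)


pc = 0x13008                            # address of the 17/auipc instruction
t0 = auipc(pc, -0x10)                   # value stored in register t0
destination = jalr_target(t0, -0x7c4)   # address the jump lands on
print(hex(t0), hex(destination))        # 0x3008 0x2080
```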
92
93 ### label slicing
94 Obviously, redoing all this arithmetic whenever the addresses change is not something we
95 want to do; we would rather use labels. The SubV syntax affords us some
96 convenience here, so let's start with the most explicit form:
97
98 17/auipc 5/rd/t0 target/off32/[31:12]
99 67/jalr 0/rd/x0 0/func 5/rs/t0 target+4/off32/[1:11:1]
100
101 In this most explicit variant, both offsets are specified as 32-bit (`off32`).
102 This is important because SubV will verify that the given offset fits in the
103 given field width. If we had used `off12` for the `67/jalr` immediate, the offset
104 value would not fit into that range and an error would occur.
105
106 Instead, we use the slice syntax `[H:L]`/`[S:H:L]` to specify which bits we want
107 to extract from each of the 32-bit values to form our shorter immediates.
108
109 The `H` and `L` indices specify the range of bits to slice out. Both limits are
110 inclusive (the highest bit taken is `H` and the lowest is `L`), so the slice
111 `[15:8]` has a size of `1 + H - L = 8` and includes the bits with indices
112 `15, 14, …, 9, 8` of the original value.
113
114 When `S` is specified as `1`, the sign bit (the highest bit) of the original value
115 is also copied as the highest bit of the resulting value. This takes up an extra
116 bit in the result value, so the size also increases by one in this case;
117 `[1:14:8]` has a size of 8 (`S + 1 + H - L = 1 + 1 + 14 - 8`) and includes the bits
118 `31, 14, 13, …, 9, 8` for a 32-bit input.
119
120 When `S` is not specified, it defaults to `0` (no sign bit).
121
122 Using this slice syntax, we can slice the labels as required:
123
124 - `target/off32/[31:12]` is a 20-bit slice
125 - `target+4/off32/[1:11:1]` is a 12-bit slice including the original sign bit,
126 the low bits 11 through 1, dropping the lowest bit.
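
The size and bit-selection rules for slices can be written down as a short Python sketch. The helpers `slice_size` and `slice_bits` are hypothetical names used only here; they restate the rules given above and say nothing about how SubV implements slicing internally. A 32-bit input is assumed by default.

```python
def slice_size(h, l, s=0):
    """Width in bits of the slice [H:L] (s=0) or [S:H:L] (s=1)."""
    return s + 1 + h - l


def slice_bits(h, l, s=0, width=32):
    """Indices of the source bits a slice selects, highest first."""
    if not (0 <= l <= h < width):
        raise ValueError("expected 0 <= L <= H < width")
    indices = list(range(h, l - 1, -1))  # H down to L, both inclusive
    if s:
        indices.insert(0, width - 1)     # prepend the sign bit of the input
    return indices


print(slice_size(15, 8))        # 8
print(slice_bits(14, 8, s=1))   # [31, 14, 13, 12, 11, 10, 9, 8]
print(slice_size(31, 12))       # 20, the 17/auipc immediate
print(slice_size(11, 1, s=1))   # 12, the 67/jalr immediate
```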
127
128 ### label offsets
129 You may have noticed the `+4` literal offset that is added to the label address.
130 Since our "jump" consists of two instructions, the offset to the target label is
131 different when calculated relative to the first instruction and to the second
132 instruction respectively.
133
134 However, when building our two offset values (the top 20 and the lower 12 bits),
135 we need to use the offset relative to the `17/auipc` instruction both times.
136 This is because the final address is calculated as `(hi + pc) + lo`, where the
137 `(hi + pc)` part is calculated by `17/auipc`, and the `(…) + lo` part is done
138 in `67/jalr`.
139
140 Using the example above, if we left off the `+4` offset, our two parts would be:
141
142 # 17/auipc at addr 0x13008
143 target/off32/[31:12]
144 = (0x02080 - 0x13008)[31:12]
145 = (-0x10f88)[31:12]
146 = -0x10
147
148 # 67/jalr at addr 0x1300c
149 target/off32/[1:11:1]
150 = (0x02080 - 0x1300c)[1:11:1]
151 = (-0x10f8c)[1:11:1]
152 = -0xf8c >> 1
153
154 If we walk through the instructions, we now get `(0x13008 - (0x10 << 12)) - 0xf8c`
155 as our jump destination, which resolves to `0x207c` - four bytes too far back!
156 The four bytes extra are the address difference between the `17/auipc` and `67/jalr`
157 instructions. By adding `4` to the label address in the `67/jalr` immediate, we can
158 cancel out this difference:
159
160 # 67/jalr at addr 0x1300c
161 target+4/off32/[1:11:1]
162 = (0x02080 + 4 - 0x1300c)[1:11:1]
163 = (-0x10f88)[1:11:1]
164 = -0xf88 >> 1
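
The effect of the `+4` can be verified numerically. The following Python sketch uses the addresses from the example; the variable names are invented for illustration.

```python
target = 0x02080
auipc_addr = 0x13008
jalr_addr = auipc_addr + 4  # 67/jalr directly follows 17/auipc

# Offset that both halves of the jump must be derived from.
offset_from_auipc = target - auipc_addr

# Offsets as seen from the 67/jalr instruction, without and with the fix.
uncorrected = target - jalr_addr
corrected = (target + 4) - jalr_addr

print(hex(offset_from_auipc), hex(corrected))  # -0x10f88 -0x10f88
print(offset_from_auipc - uncorrected)         # 4
```

With the correction applied, the offset seen from `67/jalr` is identical to the one seen from `17/auipc`, so both slices are taken from the same value.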
165
166 > **NOTE**: While this is probably very rare, you could have an offset other than
167 > `+4` if there are other instructions between `17/auipc` and `67/jalr`:
168 >
169 > 17/auipc 5/rd/t0 target/off32/[31:12]
170 > 13/opi 0/subop/add 0/rd/x0 0/rs/x0 0/imm12 # nop
171 > 13/opi 0/subop/add 0/rd/x0 0/rs/x0 0/imm12 # nop
172 > 67/jalr 0/rd/x0 0/func 5/rs/t0 target+12/off32/[1:11:1]
173 >
174 > Here there are three 32-bit instructions between the start of `17/auipc` and
175 > `67/jalr`, hence a `3 * 4 = 12` byte offset.
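
The rule from the note generalizes easily. As a sketch, assuming every instruction is 4 bytes long (no compressed instructions), a hypothetical helper `label_correction` could compute the value to add to the label:

```python
def label_correction(instructions_between, instruction_size=4):
    """Bytes to add to the label used at 67/jalr.

    `instructions_between` counts the instructions placed between
    17/auipc and 67/jalr (0 when 67/jalr follows immediately).
    Assumes all instructions have the same size.
    """
    return (instructions_between + 1) * instruction_size


print(label_correction(0))  # 4
print(label_correction(2))  # 12
```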
176
177 ## "zero page" jumps: (67/jalr 0/rs/x0)
178
179 @TODO: Need to test and understand this properly first