The RISC-V architecture constrains how positions in the address space can be addressed. No single instruction can refer to an arbitrary memory position using a literal as its argument. Instead, instructions are combined to form an address from its literal and, where that is not possible, other data structures help the code address the memory space. The coding conventions governing their use are known as code models.
However, some code models can’t access the whole address space. The linker may raise an error if it cannot adjust the instructions to access the target address in the current code model.
The medium low code model, or medlow, allows the code to address the whole RV32 address space or the lower 2 GiB and highest 2 GiB of the RV64 address space (0xFFFFFFFF7FFFF800 ~ 0xFFFFFFFFFFFFFFFF and 0x0 ~ 0x000000007FFFF7FF). A 32-bit address literal is produced by pairing lui with a load / store instruction when referring to an object, or with addi when calculating an address.
The following instructions show how to load a value, store a value, or calculate an address in the medlow code model.
# Load value from a symbol
lui a0, %hi(symbol)
lw a0, %lo(symbol)(a0)
# Store value to a symbol
lui a0, %hi(symbol)
sw a1, %lo(symbol)(a0)
# Calculate address
lui a0, %hi(symbol)
addi a0, a0, %lo(symbol)
Note
|
The ranges on RV64 are not 0x0 ~ 0x000000007FFFFFFF and
0xFFFFFFFF80000000 ~ 0xFFFFFFFFFFFFFFFF due to RISC-V’s sign-extension of
immediates; the following code fragments show where the ranges come from:
|
# Largest positive number:
lui a0, 0x7ffff # a0 = 0x7ffff000
addi a0, a0, 0x7ff # a0 = a0 + 2047 = 0x000000007FFFF7FF
# Smallest negative number:
lui a0, 0x80000 # a0 = 0xffffffff80000000
addi a0, a0, -0x800 # a0 = a0 + -2048 = 0xFFFFFFFF7FFFF800
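The two fragments above can be checked with a small model of the arithmetic. The following is a minimal sketch, assuming that lui sign-extends its 32-bit result on RV64 and that addi sign-extends its 12-bit immediate; the helper names `sext` and `lui_addi_rv64` are hypothetical and exist only for this illustration.

```python
MASK64 = (1 << 64) - 1


def sext(value, bits):
    """Sign-extend the low `bits` bits of value into a Python int."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value & (1 << (bits - 1)) else value


def lui_addi_rv64(hi20, lo12):
    """Model `lui rd, hi20` followed by `addi rd, rd, lo12` on RV64."""
    upper = sext(hi20 << 12, 32)          # lui result, sign-extended from bit 31
    return (upper + sext(lo12, 12)) & MASK64


largest = lui_addi_rv64(0x7FFFF, 0x7FF)   # +2047 on top of 0x7ffff000
smallest = lui_addi_rv64(0x80000, 0x800)  # 0x800 encodes the immediate -2048
print(hex(largest), hex(smallest))
```

Under these assumptions the two results match the range bounds quoted for RV64 in the note.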
The medium any code model, or medany, allows the code to address the range between -2 GiB and +2 GiB from its position. A signed 32-bit offset, relative to the value of the pc register, is produced by pairing auipc with a load / store instruction when referring to an object, or with addi when calculating an address.
As a special edge-case, undefined weak symbols must still be supported, whose addresses will be 0 and may be out of range depending on the address at which the code is linked. Any references to possibly-undefined weak symbols should be made indirectly through the GOT as is used for position-independent code. Not doing so is deprecated and a future version of this specification will require using the GOT, not just advise.
Note
|
This is not yet a requirement as existing toolchains predating this part of the specification do not adhere to this, and without improvements to linker relaxation support doing so would regress performance and code size. |
The following instructions show how to load a value, store a value, or calculate an address in the medany code model.
# Load value from a symbol
.Ltmp0: auipc a0, %pcrel_hi(symbol)
lw a0, %pcrel_lo(.Ltmp0)(a0)
# Store value to a symbol
.Ltmp1: auipc a0, %pcrel_hi(symbol)
sw a1, %pcrel_lo(.Ltmp1)(a0)
# Calculate address
.Ltmp2: auipc a0, %pcrel_hi(symbol)
addi a0, a0, %pcrel_lo(.Ltmp2)
Note
|
Although the generated code is technically position independent, it’s not suitable for ELF shared libraries due to differing symbol interposition rules; for that, please use the medium position independent code model below. |
The medium position independent code model is similar to the medium any code model, but uses the global offset table (GOT) for non-local symbol addresses.
# Load value from a local symbol
.Ltmp0: auipc a0, %pcrel_hi(symbol)
lw a0, %pcrel_lo(.Ltmp0)(a0)
# Store value to a local symbol
.Ltmp1: auipc a0, %pcrel_hi(symbol)
sw a1, %pcrel_lo(.Ltmp1)(a0)
# Calculate address of a local symbol
.Ltmp2: auipc a0, %pcrel_hi(symbol)
addi a0, a0, %pcrel_lo(.Ltmp2)
# Calculate address of non-local symbol
.Ltmp3: auipc a0, %got_pcrel_hi(symbol)
l[w|d] a0, %pcrel_lo(.Ltmp3)(a0)
Any functions that use registers in a way that is incompatible with
the calling convention of the ABI in use must be annotated with
STO_RISCV_VARIANT_CC
, as defined in Symbol Table.
Note
|
Vector registers have a variable size depending on the hardware
implementation and can be quite large. Saving/restoring all these vector
arguments in a run-time linker’s lazy resolver would use a large amount of
stack space and hurt performance. The STO_RISCV_VARIANT_CC attribute requires
the run-time linker to resolve the symbol directly, which avoids saving/restoring
any vector registers.
|
C++ name mangling for RISC-V follows the Itanium C++ ABI [itanium-cxx-abi], with additional mangling for the RISC-V vector data types and vector mask types, which is defined in the following section.
See the "Type encodings" section in Itanium C++ ABI for more detail on how to mangle types. Note that __bf16 is mangled in the same way as std::bfloat16_t.
The vector data types and vector mask types, as defined in the section
[Vector type sizes and alignments], are treated as vendor-extended types in
the Itanium C++ ABI [itanium-cxx-abi]. The mangled name for
these types is "u"<len>"__rvv_"<type-name>. Specifically, the type name is
prefixed with __rvv_, which is prefixed by a decimal string indicating the
length of the prefixed name, which is in turn prefixed by "u".
For example:
void foo(vint8m1_t x);
is mangled as
_Z3foou15__rvv_vint8m1_t
mangled-name = "u" len "__rvv_" type-name
len = nonzero *DIGIT
nonzero = "1" / "2" / "3" / "4" / "5" / "6" / "7" / "8" / "9"
type-name = identifier-nondigit *identifier-char
identifier-nondigit = ALPHA / "_"
identifier-char = identifier-nondigit / DIGIT
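The rule can be exercised with a short sketch. It assumes that type names are ordinary C identifiers (letters, digits and underscores) and it only handles an unscoped function whose parameters are all RVV types; `mangle_rvv_type` and `mangle_function` are hypothetical helper names, not part of any toolchain.

```python
import re

# Assumption: a type name is a C identifier (letters, digits, underscores).
TYPE_NAME = re.compile(r"[A-Za-z_][A-Za-z0-9_]*\Z")


def mangle_rvv_type(type_name):
    """Return "u" <len> "__rvv_" <type-name> for a vector type name."""
    if not TYPE_NAME.match(type_name):
        raise ValueError("not a valid type name: %r" % type_name)
    body = "__rvv_" + type_name
    return "u" + str(len(body)) + body


def mangle_function(name, param_types):
    """Mangle an unscoped function taking only RVV parameters (sketch)."""
    params = "".join(mangle_rvv_type(t) for t in param_types)
    return "_Z" + str(len(name)) + name + params


print(mangle_function("foo", ["vint8m1_t"]))
```

For `void foo(vint8m1_t x);` this reproduces the mangled name given in the example above.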
The ELF object file format for RISC-V follows the Generic System V Application Binary Interface [gabi] ("gABI"); this specification only describes RISC-V-specific definitions.
The section below lists the defined RISC-V-specific values for several ELF header fields; any fields not listed in this section have no RISC-V-specific values.
- e_ident
-
- EI_CLASS
-
Specifies the base ISA, either RV32 or RV64. Linking RV32 and RV64 code together is not supported.
ELFCLASS64 ELF-64 Object File
ELFCLASS32 ELF-32 Object File
- EI_DATA
-
Specifies the endianness; either big-endian or little-endian. Linking big-endian and little-endian code together is not supported.
ELFDATA2LSB Little-endian Object File
ELFDATA2MSB Big-endian Object File
- e_machine
-
Identifies the machine this ELF file targets. Always contains EM_RISCV (243) for RISC-V ELF files.
- e_flags
-
Describes the format of this ELF file. These flags are used by the linker to disallow linking ELF files with incompatible ABIs together. Table 1 shows the layout of e_flags, and flag details are listed below.
Table 1. Layout of e_flags
Bit 0 | Bits 1 - 2 | Bit 3 | Bit 4 | Bits 5 - 23 | Bits 24 - 31 |
---|---|---|---|---|---|
RVC | Float ABI | RVE | TSO | Reserved | Non-standard extensions |
- EF_RISCV_RVC (0x0001)
-
This bit is set when the binary targets the C ABI, which allows instructions to be aligned to 16-bit boundaries (the base RV32 and RV64 ISAs only allow 32-bit instruction alignment). When linking objects which specify EF_RISCV_RVC, the linker is permitted to use RVC instructions such as C.JAL in the linker relaxation process.
- EF_RISCV_FLOAT_ABI_SOFT (0x0000)
- EF_RISCV_FLOAT_ABI_SINGLE (0x0002)
- EF_RISCV_FLOAT_ABI_DOUBLE (0x0004)
- EF_RISCV_FLOAT_ABI_QUAD (0x0006)
-
These flags identify the floating point ABI in use for this ELF file. They store the largest floating-point type that ends up in registers as part of the ABI (but do not control if code generation is allowed to use floating-point internally). The rule is that if you have a floating-point type in a register, then you also have all smaller floating-point types in registers. For example _DOUBLE would store "float" and "double" values in F registers, but would not store "long double" values in F registers. If none of the float ABI flags are set, the object is taken to use the soft-float ABI.
- EF_RISCV_FLOAT_ABI (0x0006)
-
This macro is used as a mask to test for one of the above floating-point ABIs, e.g.,
(e_flags & EF_RISCV_FLOAT_ABI) == EF_RISCV_FLOAT_ABI_DOUBLE
.
- EF_RISCV_RVE (0x0008)
-
This bit is set when the binary targets the E ABI.
- EF_RISCV_TSO (0x0010)
-
This bit is set when the binary requires the RVTSO memory consistency model.
Until such a time that the Reserved bits (0x00ffffe0) are allocated by future versions of this specification, they shall not be set by standard software. Non-standard extensions are free to use bits 24-31 for any purpose. This may conflict with other non-standard extensions.
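As an illustration of the layout, the following minimal sketch splits an e_flags value into the fields described above. The constant values are the ones listed in this section; the function name `describe_e_flags` and the dictionary keys are hypothetical.

```python
EF_RISCV_RVC = 0x0001
EF_RISCV_FLOAT_ABI = 0x0006      # mask covering bits 1 - 2
EF_RISCV_RVE = 0x0008
EF_RISCV_TSO = 0x0010
RESERVED_MASK = 0x00FFFFE0       # bits 5 - 23

FLOAT_ABI_NAMES = {
    0x0000: "soft",
    0x0002: "single",
    0x0004: "double",
    0x0006: "quad",
}


def describe_e_flags(e_flags):
    """Decode the RISC-V specific fields of an ELF header's e_flags."""
    return {
        "rvc": bool(e_flags & EF_RISCV_RVC),
        "float_abi": FLOAT_ABI_NAMES[e_flags & EF_RISCV_FLOAT_ABI],
        "rve": bool(e_flags & EF_RISCV_RVE),
        "tso": bool(e_flags & EF_RISCV_TSO),
        "reserved": e_flags & RESERVED_MASK,
        "nonstandard": (e_flags >> 24) & 0xFF,
    }


print(describe_e_flags(0x0005))  # RVC set, double-float ABI
```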
Note
|
There is no provision for compatibility between conflicting uses of the e_flags bits reserved for non-standard extensions, and many standard RISC-V tools will ignore them. Do not use them unless you control both the toolchain and the operating system, and the ABI differences are so significant that they cannot be expressed with either a .RISCV.attributes tag or an ELF note, such as using a different syscall ABI. |
==== Policy for Merge Objects With Different File Headers
This section describes the behavior when the input files come with different file headers.
e_ident and e_machine should have exactly the same value in every input file; otherwise the linker should raise an error. e_flags has a different policy for each field:
- RVC
-
Input files can have different values for the RVC field; the linker should set EF_RISCV_RVC in the output if any of the input objects has it set.
- Float ABI
-
The linker should report an error if the input object files have different values for the float ABI field.
- RVE
-
The linker should report an error if the input object files have different values for the RVE field.
- TSO
-
Input files can have different values for the TSO field; the linker should set this field if any of the input objects have the TSO field set.
Note
|
The static linker may ignore the compatibility checks if all fields in the e_flags are zero and all sections in the input file are non-executable sections. |
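The per-field policy above can be sketched as follows. This is only an illustration under simplifying assumptions: it ignores the reserved and non-standard bits, and it does not model the exemption for inputs whose e_flags are all zero and which contain only non-executable sections. The names `merge_e_flags` and `IncompatibleObjects` are hypothetical.

```python
EF_RISCV_RVC = 0x0001
EF_RISCV_FLOAT_ABI = 0x0006
EF_RISCV_RVE = 0x0008
EF_RISCV_TSO = 0x0010


class IncompatibleObjects(Exception):
    """Raised when input objects cannot be linked together."""


def merge_e_flags(input_flags):
    """Combine the e_flags of several input objects into the output value."""
    flags = list(input_flags)
    if not flags:
        raise ValueError("no input objects")
    first = flags[0]
    # Float ABI and RVE must agree, so the output takes them from any input.
    merged = first & (EF_RISCV_FLOAT_ABI | EF_RISCV_RVE)
    for f in flags:
        if (f & EF_RISCV_FLOAT_ABI) != (first & EF_RISCV_FLOAT_ABI):
            raise IncompatibleObjects("float ABI mismatch")
        if (f & EF_RISCV_RVE) != (first & EF_RISCV_RVE):
            raise IncompatibleObjects("RVE mismatch")
        # RVC and TSO are set in the output if any input sets them.
        merged |= f & (EF_RISCV_RVC | EF_RISCV_TSO)
    return merged


print(hex(merge_e_flags([0x04, 0x05, 0x14])))
```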
- st_other
-
The lower 2 bits are used to specify a symbol’s visibility. The remaining 6 bits have no defined meaning in the ELF gABI. We use the highest bit to mark functions that do not follow the standard calling convention for the ABI in use.
The defined processor-specific st_other flags are listed in RISC-V-specific st_other flags.
Table 2. RISC-V-specific st_other flags
Name | Mask |
---|---|
STO_RISCV_VARIANT_CC | 0x80 |
See Dynamic Linking for the meaning of STO_RISCV_VARIANT_CC.
__global_pointer$
must be exported in the dynamic symbol table of dynamically-linked
executables if there are any GP-relative accesses present in the executable.
RISC-V is a classical RISC architecture that has densely packed non-word sized instruction immediate values. While the linker can make relocations on arbitrary memory locations, many of the RISC-V relocations are designed for use with specific instructions or instruction sequences. RISC-V has several instruction specific encodings for PC-Relative address loading, jumps, branches and the RVC compressed instruction set.
The purpose of this section is to describe the RISC-V specific instruction sequences with their associated relocations, in addition to the general purpose machine word sized relocations that are used for symbol addresses in the Global Offset Table or DWARF metadata.
Relocation types provides details of the RISC-V ELF relocations; the meaning of each column is given below:
- Enum
-
The number of the relocation, encoded in the r_info field
- ELF Reloc Type
-
The name of the relocation, omitting the prefix of R_RISCV_.
- Type
-
Whether the relocation is a static or dynamic relocation:
-
A static relocation relocates a location in a relocatable file, processed by a static linker.
-
A dynamic relocation relocates a location in an executable or shared object, processed by a run-time linker.
-
Both
: Some relocation types are used by both static relocations and dynamic relocations.
-
- Field
-
Describes the set of bits affected by this relocation; see Field Symbols for the definitions of the individual types
- Calculation
-
Formula for how to resolve the relocation value; definitions of the symbols can be found in Calculation Symbols
- Description
-
Additional information about the relocation
Enum | ELF Reloc Type | Type | Field | Calculation | Description |
---|---|---|---|---|---|
0 | NONE | None | | | |
1 | 32 | Both | word32 | S + A | 32-bit relocation |
2 | 64 | Both | word64 | S + A | 64-bit relocation |
3 | RELATIVE | Dynamic | wordclass | B + A | Adjust a link address (A) to its load address (B + A) |
4 | COPY | Dynamic | | | Must be in executable; not allowed in shared library |
5 | JUMP_SLOT | Dynamic | wordclass | S | Indicates the symbol associated with a PLT entry |
6 | TLS_DTPMOD32 | Dynamic | word32 | TLSMODULE | |
7 | TLS_DTPMOD64 | Dynamic | word64 | TLSMODULE | |
8 | TLS_DTPREL32 | Dynamic | word32 | S + A - TLS_DTV_OFFSET | |
9 | TLS_DTPREL64 | Dynamic | word64 | S + A - TLS_DTV_OFFSET | |
10 | TLS_TPREL32 | Dynamic | word32 | S + A + TLSOFFSET | |
11 | TLS_TPREL64 | Dynamic | word64 | S + A + TLSOFFSET | |
12 | TLSDESC | Dynamic | See TLS Descriptors | TLSDESC(S+A) | |
16 | BRANCH | Static | B-Type | S + A - P | 12-bit PC-relative branch offset |
17 | JAL | Static | J-Type | S + A - P | 20-bit PC-relative jump offset |
18 | CALL | Static | U+I-Type | S + A - P | Deprecated, please use CALL_PLT instead. 32-bit PC-relative function call, macros |
19 | CALL_PLT | Static | U+I-Type | S + A - P | 32-bit PC-relative function call, macros |
20 | GOT_HI20 | Static | U-Type | G + GOT + A - P | High 20 bits of 32-bit PC-relative GOT access, |
21 | TLS_GOT_HI20 | Static | U-Type | | High 20 bits of 32-bit PC-relative TLS IE GOT access, macro |
22 | TLS_GD_HI20 | Static | U-Type | | High 20 bits of 32-bit PC-relative TLS GD GOT reference, macro |
23 | PCREL_HI20 | Static | U-Type | S + A - P | High 20 bits of 32-bit PC-relative reference, |
24 | PCREL_LO12_I | Static | I-Type | S - P | Low 12 bits of a 32-bit PC-relative, |
25 | PCREL_LO12_S | Static | S-Type | S - P | Low 12 bits of a 32-bit PC-relative, |
26 | HI20 | Static | U-Type | S + A | High 20 bits of 32-bit absolute address, |
27 | LO12_I | Static | I-Type | S + A | Low 12 bits of 32-bit absolute address, |
28 | LO12_S | Static | S-Type | S + A | Low 12 bits of 32-bit absolute address, |
29 | TPREL_HI20 | Static | U-Type | | High 20 bits of TLS LE thread pointer offset, |
30 | TPREL_LO12_I | Static | I-Type | | Low 12 bits of TLS LE thread pointer offset, |
31 | TPREL_LO12_S | Static | S-Type | | Low 12 bits of TLS LE thread pointer offset, |
32 | TPREL_ADD | Static | | | TLS LE thread pointer usage, |
33 | ADD8 | Static | word8 | V + S + A | 8-bit label addition |
34 | ADD16 | Static | word16 | V + S + A | 16-bit label addition |
35 | ADD32 | Static | word32 | V + S + A | 32-bit label addition |
36 | ADD64 | Static | word64 | V + S + A | 64-bit label addition |
37 | SUB8 | Static | word8 | V - S - A | 8-bit label subtraction |
38 | SUB16 | Static | word16 | V - S - A | 16-bit label subtraction |
39 | SUB32 | Static | word32 | V - S - A | 32-bit label subtraction |
40 | SUB64 | Static | word64 | V - S - A | 64-bit label subtraction |
41 | GOT32_PCREL | Static | word32 | G + GOT + A - P | 32-bit difference between the GOT entry for a symbol and the current location |
42 | Reserved | - | | | Reserved for future standard use |
43 | ALIGN | Static | | | Alignment statement. The addend indicates the number of bytes occupied by |
44 | RVC_BRANCH | Static | CB-Type | S + A - P | 8-bit PC-relative branch offset |
45 | RVC_JUMP | Static | CJ-Type | S + A - P | 11-bit PC-relative jump offset |
46 | Reserved | - | | | Reserved for future standard use |
47 | GPREL_LO12_I | Static | I-Type | S + A - GP | Low 12 bits of a 32-bit GP-relative address, |
48 | GPREL_LO12_S | Static | S-Type | S + A - GP | Low 12 bits of a 32-bit GP-relative address, |
49 | GPREL_HI20 | Static | U-Type | S + A - GP | High 20 bits of a 32-bit GP-relative address, |
50 | Reserved | - | | | Reserved for future standard use |
51 | RELAX | Static | | | Instruction can be relaxed, paired with a normal relocation at the same address |
52 | SUB6 | Static | word6 | V - S - A | Local label subtraction |
53 | SET6 | Static | word6 | S + A | Local label assignment |
54 | SET8 | Static | word8 | S + A | Local label assignment |
55 | SET16 | Static | word16 | S + A | Local label assignment |
56 | SET32 | Static | word32 | S + A | Local label assignment |
57 | 32_PCREL | Static | word32 | S + A - P | 32-bit PC relative |
58 | IRELATIVE | Dynamic | wordclass | | Relocation against a non-preemptible ifunc symbol |
59 | PLT32 | Static | word32 | S + A - P | 32-bit relative offset to a function or its PLT entry |
60 | SET_ULEB128 | Static | ULEB128 | S + A | Must be placed immediately before a SUB_ULEB128 with the same offset. Local label assignment *note |
61 | SUB_ULEB128 | Static | ULEB128 | V - S - A | Must be placed immediately after a SET_ULEB128 with the same offset. Local label subtraction *note |
62 | TLSDESC_HI20 | Static | U-Type | S + A - P | High 20 bits of a 32-bit PC-relative offset into a TLS descriptor entry, |
63 | TLSDESC_LOAD_LO12 | Static | I-Type | S - P | Low 12 bits of a 32-bit PC-relative offset into a TLS descriptor entry, |
64 | TLSDESC_ADD_LO12 | Static | I-Type | S - P | Low 12 bits of a 32-bit PC-relative offset into a TLS descriptor entry, |
65 | TLSDESC_CALL | Static | | | Annotate call to TLS descriptor resolver function, |
66-190 | Reserved | - | | | Reserved for future standard use |
191 | VENDOR | Static | | | Paired with a vendor-specific relocation and must be placed immediately before it, indicates which vendor owns the relocation. |
192-255 | Reserved | - | | | Reserved for nonstandard ABI extensions |
Nonstandard extensions are free to use relocation numbers 192-255 for any
purpose. These vendor-specific relocations must be preceded by a
R_RISCV_VENDOR relocation against a vendor ID symbol.
Where possible, tools should present relocations as their vendor-specific
relocation types; otherwise a generic name of R_RISCV_CUSTOM<enum value> must
be shown. Data contained in the paired R_RISCV_VENDOR can be used to select the
appropriate vendor when performing relocations.
This section and later ones contain fragments written in assembler. The precise assembler syntax, including that of the relocations, is described in the RISC-V Assembly Programmer’s Manual [rv-asm].
Note
|
The assembler must allocate sufficient space to accommodate the final
value for the R_RISCV_SET_ULEB128 and R_RISCV_SUB_ULEB128 relocation pair
and fill the space with a single ULEB128-encoded value.
This is achieved by prepending the redundant 0x80 byte as necessary.
The linker must not alter the length of the ULEB128-encoded value.
|
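One way to read the padding requirement in the note above is that the assembler emits continuation bytes that carry no value bits, so that the encoded value occupies a fixed number of bytes which the linker can later overwrite in place. The sketch below follows that reading; it is an assumption about the intent rather than a description of any particular assembler, and the function names are hypothetical.

```python
def encode_uleb128_padded(value, length):
    """Encode value as ULEB128 using exactly `length` bytes."""
    if value < 0 or length < 1:
        raise ValueError("value must be non-negative and length positive")
    out = bytearray()
    for _ in range(length - 1):
        out.append((value & 0x7F) | 0x80)  # continuation bit always set
        value >>= 7
    if value > 0x7F:
        raise OverflowError("value does not fit in the reserved space")
    out.append(value)                      # final byte, continuation bit clear
    return bytes(out)


def decode_uleb128(data):
    """Decode a single ULEB128 value from the start of data."""
    result = 0
    for index, byte in enumerate(data):
        result |= (byte & 0x7F) << (7 * index)
        if not byte & 0x80:
            break
    return result


padded = encode_uleb128_padded(1, 3)
print(padded.hex(), decode_uleb128(padded))
```

Because the length is fixed up front, a linker that rewrites the value can reuse the same space without changing the size of the section.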
Vendor identifiers are dummy symbols used in the corresponding R_RISCV_VENDOR
relocation (irrespective of ELF class/XLEN) and must be unique amongst all
vendors providing custom relocations. Vendor identifiers may be suffixed with a
tag to provide extra relocations for a given vendor.
Note
|
Please refer to the RISC-V Toolchain Conventions [rv-toolchain-conventions] for the full list. |
Variables used in relocation calculation provides details on the variables used in relocation calculation:
Variable | Description |
---|---|
A | Addend field in the relocation entry associated with the symbol |
B | Base address of a shared object loaded into memory |
G | Offset of the symbol into the GOT (Global Offset Table) |
GOT | Address of the GOT (Global Offset Table) |
P | Position of the relocation |
S | Value of the symbol in the symbol table |
V | Value at the position of the relocation |
GP | Value of |
TLSMODULE | TLS module index for the object containing the symbol |
TLSOFFSET | TLS static block offset (relative to |
Global Pointer: It is assumed that program startup code will load the value
of the __global_pointer$
symbol into register gp
(aka x3
).
Variables used in relocation fields provides details on the variables used in relocation fields:
Variable | Description |
---|---|
word6 | Specifies the 6 least significant bits of a word8 field |
word8 | Specifies an 8-bit word |
word16 | Specifies a 16-bit word |
word32 | Specifies a 32-bit word |
word64 | Specifies a 64-bit word |
ULEB128 | Specifies variable-length data encoded in ULEB128 format |
wordclass | Specifies a word32 field for ILP32 or a word64 field for LP64 |
B-Type | Specifies a field as the immediate field in a B-type instruction |
CB-Type | Specifies a field as the immediate field in a CB-type instruction |
CI-Type | Specifies a field as the immediate field in a CI-type instruction |
CJ-Type | Specifies a field as the immediate field in a CJ-type instruction |
I-Type | Specifies a field as the immediate field in an I-type instruction |
S-Type | Specifies a field as the immediate field in an S-type instruction |
U-Type | Specifies a field as the immediate field in a U-type instruction |
J-Type | Specifies a field as the immediate field in a J-type instruction |
U+I-Type | Specifies a field as the immediate fields in a U-type and I-type instruction pair |
Constants used in relocation fields provides details on the constants used in relocation fields:
Name | Value |
---|---|
TLS_DTV_OFFSET | 0x800 |
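To make the calculation notation concrete, here is a minimal sketch that evaluates a few of the formulas from the relocation table. The symbol values are made up for the example, the function names are hypothetical, and the pairing of an ADD32 with a SUB32 to form a label difference is shown as an illustration of the V + S + A and V - S - A formulas rather than as a statement about what any given assembler emits.

```python
TLS_DTV_OFFSET = 0x800


def apply_add(v, s, a, bits):
    """ADD8/16/32/64: V + S + A, truncated to the field width."""
    return (v + s + a) & ((1 << bits) - 1)


def apply_sub(v, s, a, bits):
    """SUB8/16/32/64: V - S - A, truncated to the field width."""
    return (v - s - a) & ((1 << bits) - 1)


def tls_dtprel(s, a):
    """TLS_DTPREL32/64: S + A - TLS_DTV_OFFSET."""
    return s + a - TLS_DTV_OFFSET


# Hypothetical label addresses; the word at P starts out as zero.
start, end = 0x1000, 0x1040
word = 0
word = apply_add(word, end, 0, 32)    # V becomes the address of `end`
word = apply_sub(word, start, 0, 32)  # V becomes `end - start`
print(hex(word), hex(tls_dtprel(0x810, 0)))
```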
32-bit absolute addresses in position dependent code are loaded with a pair
of instructions which have an associated pair of relocations:
R_RISCV_HI20
plus R_RISCV_LO12_I
or R_RISCV_LO12_S
.
The R_RISCV_HI20
refers to an LUI
instruction containing the high
20-bits to be relocated to an absolute symbol address. The LUI
instruction
is used in conjunction with one or more I-Type instructions (add immediate or
load) with R_RISCV_LO12_I
relocations or S-Type instructions (store) with
R_RISCV_LO12_S
relocations.
The addresses for the pair of relocations are
calculated like this:
HI20 |
|
LO12 |
|
The following assembly and relocations show loading an absolute address:
lui a0, %hi(symbol) # R_RISCV_HI20 (symbol)
addi a0, a0, %lo(symbol) # R_RISCV_LO12_I (symbol)
A symbol can be loaded in multiple fragments using different addends, where
multiple instructions associated with R_RISCV_LO12_I
/R_RISCV_LO12_S
share a
single R_RISCV_HI20
. The HI20 values for the multiple fragments must be
identical, a condition met when the symbol is sufficiently aligned.
lui a0, 0 # R_RISCV_HI20 (symbol)
lw a1, 0(a0) # R_RISCV_LO12_I (symbol)
lw a2, 0(a0) # R_RISCV_LO12_I (symbol+4)
lw a3, 0(a0) # R_RISCV_LO12_I (symbol+8)
lw a0, 0(a0) # R_RISCV_LO12_I (symbol+12)
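Whether several fragments can share one R_RISCV_HI20 can be checked numerically. The sketch below assumes the conventional split in which the high part is rounded by adding 0x800 before shifting, so that the sign-extended low 12 bits bring the sum back to the full address; the symbol addresses used here are hypothetical and the helper names are not part of any toolchain.

```python
def hi20(address):
    """High 20 bits, rounded so the signed low part completes the address."""
    return ((address + 0x800) >> 12) & 0xFFFFF


def lo12(address):
    """Low 12 bits interpreted as a signed immediate."""
    low = address & 0xFFF
    return low - 0x1000 if low >= 0x800 else low


def can_share_hi20(symbol, addends):
    """True if symbol+addend yields the same HI20 for every addend."""
    return len({hi20(symbol + a) for a in addends}) == 1


addends = [0, 4, 8, 12]
print(can_share_hi20(0x12340, addends))  # 16-byte aligned symbol
print(can_share_hi20(0x127F8, addends))  # only 8-byte aligned, straddles 0x800
```

In the second case the fragments at symbol+8 and symbol+12 cross the point where the signed low part wraps, so their HI20 value differs from that of the first two fragments and a single lui cannot serve all four loads.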
For position independent code in dynamically linked objects, each shared object contains a GOT (Global Offset Table), which contains addresses of global symbols (objects and functions) referred to by the dynamically linked shared object. The GOT in each shared library is filled in by the dynamic linker during program loading, or on the first call to extern functions.
To avoid dynamic relocations within the text segment of position independent code the GOT is used for indirection. Instead of code loading virtual addresses directly, as can be done in static code, addresses are loaded from the GOT. This allows runtime binding to external objects and functions at the expense of a slightly higher runtime overhead for access to extern objects and functions.
The PLT (Procedure Linkage Table) exists to allow function calls between dynamically linked shared objects. Each dynamic object has its own GOT (Global Offset Table) and PLT (Procedure Linkage Table).
The first entry of a shared object PLT is a special entry that calls
_dl_runtime_resolve
to resolve the GOT offset for the called function.
The _dl_runtime_resolve
function in the dynamic loader resolves the
GOT offsets lazily on the first call to any function, except when
LD_BIND_NOW
is set in which case the GOT entries are populated by the
dynamic linker before the executable is started. Lazy resolution of GOT
entries is intended to speed up program loading by deferring symbol
resolution to the first time the function is called. The first entry
in the PLT occupies two 16 byte entries:
1: auipc t2, %pcrel_hi(.got.plt)
sub t1, t1, t3 # shifted .got.plt offset + hdr size + 12
l[w|d] t3, %pcrel_lo(1b)(t2) # _dl_runtime_resolve
addi t1, t1, -(hdr size + 12) # shifted .got.plt offset
addi t0, t2, %pcrel_lo(1b) # &.got.plt
srli t1, t1, log2(16/PTRSIZE) # .got.plt offset
l[w|d] t0, PTRSIZE(t0) # link map
jr t3
Subsequent function entry stubs in the PLT take up 16 bytes and load a
function pointer from the GOT. On the first call to a function, the
entry redirects to the first PLT entry which calls _dl_runtime_resolve
and fills in the GOT entry for subsequent calls to the function:
1: auipc t3, %pcrel_hi(function@.got.plt)
l[w|d] t3, %pcrel_lo(1b)(t3)
jalr t1, t3
nop
R_RISCV_CALL
and R_RISCV_CALL_PLT
relocations are associated with
pairs of instructions (AUIPC+JALR
) generated by the CALL
or TAIL
pseudoinstructions. Originally, these relocations had slightly different
behavior, but that has turned out to be unnecessary, and they are now
interchangeable. R_RISCV_CALL is deprecated; use R_RISCV_CALL_PLT
instead.
With linker relaxation enabled, the AUIPC
instruction in the AUIPC+JALR
pair has
both a R_RISCV_CALL
or R_RISCV_CALL_PLT
relocation and an R_RISCV_RELAX
relocation indicating the instruction sequence can be relaxed during linking.
Procedure call linker relaxation allows the AUIPC+JALR
pair to be relaxed
to the JAL
instruction when the procedure or PLT entry is within (-1MiB to
+1MiB-2) of the instruction pair.
The pseudoinstruction:
call symbol
call symbol@plt
expands to the following assembly and relocation:
auipc ra, 0 # R_RISCV_CALL (symbol), R_RISCV_RELAX (symbol)
jalr ra, ra, 0
and when symbol has an @plt
suffix it expands to:
auipc ra, 0 # R_RISCV_CALL_PLT (symbol), R_RISCV_RELAX (symbol)
jalr ra, ra, 0
Unconditional jump (J-Type) instructions have a R_RISCV_JAL
relocation
that can represent an even signed 21-bit offset (-1MiB to +1MiB-2).
Branch (B-Type) instructions have a R_RISCV_BRANCH
relocation that
can represent an even signed 13-bit offset (-4096 to +4094).
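The two ranges can be expressed as a single check on an even signed offset of a given width. This is a minimal sketch; the function names are hypothetical, and the same test with a 21-bit width also describes when a target lies within the quoted -1MiB to +1MiB-2 range of a JAL.

```python
def fits_signed_even(offset, bits):
    """True if offset is even and representable as a signed `bits`-bit value."""
    limit = 1 << (bits - 1)
    return offset % 2 == 0 and -limit <= offset < limit


def fits_jal(offset):
    """R_RISCV_JAL: even signed 21-bit offset."""
    return fits_signed_even(offset, 21)


def fits_branch(offset):
    """R_RISCV_BRANCH: even signed 13-bit offset."""
    return fits_signed_even(offset, 13)


print(fits_jal((1 << 20) - 2), fits_jal(1 << 20))
print(fits_branch(-4096), fits_branch(4096))
```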
32-bit PC-relative relocations for symbol addresses on sequences of
instructions such as the AUIPC+ADDI
instruction pair expanded from
the la
pseudoinstruction, in position independent code typically
have an associated pair of relocations: R_RISCV_PCREL_HI20
plus
R_RISCV_PCREL_LO12_I
or R_RISCV_PCREL_LO12_S
.
The R_RISCV_PCREL_HI20
relocation refers to an AUIPC
instruction
containing the high 20-bits to be relocated to a symbol relative to the
program counter address of the AUIPC
instruction. The AUIPC
instruction is used in conjunction with one or more I-Type instructions
(add immediate or load) with R_RISCV_PCREL_LO12_I
relocations or S-Type
instructions (store) with R_RISCV_PCREL_LO12_S
relocations.
The R_RISCV_PCREL_LO12_I
or R_RISCV_PCREL_LO12_S
relocations contain
a label pointing to an instruction in the same section with an
R_RISCV_PCREL_HI20
relocation entry that points to the target symbol:
- At label: R_RISCV_PCREL_HI20 relocation entry → symbol
- R_RISCV_PCREL_LO12_I relocation entry → label
To obtain the symbol address needed to fill the 12-bit immediate on the add,
load or store instruction, the linker finds the R_RISCV_PCREL_HI20 relocation
entry associated with the AUIPC instruction. The addresses for the pair of
relocations are calculated like this:
HI20 |
|
LO12 |
|
The successive instruction has a signed 12-bit immediate so the value of the preceding high 20-bit relocation may have 1 added to it.
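The rounding described above can be sketched as follows. The example assumes the conventional split of a PC-relative offset, where 0x800 is added before taking the upper 20 bits so that the signed 12-bit immediate of the following instruction completes the offset; both parts are computed relative to the address of the AUIPC instruction, not of the instruction carrying the low part. The name `pcrel_split` and the addresses are hypothetical.

```python
def pcrel_split(symbol, addend, auipc_pc):
    """Split S + A - P into (hi20, lo12) for an AUIPC-based sequence."""
    offset = symbol + addend - auipc_pc
    hi = (offset + 0x800) >> 12       # rounds up when the low part is negative
    lo = offset - (hi << 12)          # always in the range -2048 .. 2047
    if not -(1 << 19) <= hi < (1 << 19):
        raise OverflowError("target is out of range for AUIPC")
    return hi, lo


# Hypothetical addresses: AUIPC at 0x10000, symbol at 0x20800.
hi, lo = pcrel_split(0x20800, 0, 0x10000)
print(hex(hi), lo)
```

Here the plain upper bits of the offset 0x10800 would be 0x10, but the high part becomes 0x11 because the low part is encoded as -2048.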
Note that the compiler-emitted instructions for PC-relative symbol addresses are
not necessarily sequential or in pairs. The constraint is that the label
referenced by the R_RISCV_PCREL_LO12_I or R_RISCV_PCREL_LO12_S relocation
of an instruction points to a valid HI20 PC-relative relocation pointing to
the symbol.
Here is an assembler example showing the relocation types:
label:
auipc t0, %pcrel_hi(symbol) # R_RISCV_PCREL_HI20 (symbol)
lui t1, 1
lw t2, %pcrel_lo(label)(t0) # R_RISCV_PCREL_LO12_I (label)
add t2, t2, t1
sw t2, %pcrel_lo(label)(t0) # R_RISCV_PCREL_LO12_S (label)
The relocation type R_RISCV_ALIGN marks a location that must be aligned to
N bytes, where N is the smallest power of two that is greater than the value
of the addend field. For example, R_RISCV_ALIGN with addend value 2 means align
to 4 bytes, and R_RISCV_ALIGN with addend value 4 means align to 8 bytes. This
relocation is only required if the containing section has any R_RISCV_RELAX
relocations. R_RISCV_ALIGN points to the beginning of the padding bytes,
and the instruction that actually needs to be aligned is located at the point
of R_RISCV_ALIGN plus its addend.
To ensure the linker can always satisfy the required alignment solely by
deleting bytes, the compiler or assembler must emit a R_RISCV_ALIGN relocation
and then insert N - [IALIGN] padding bytes before the location that needs to be
aligned. The alignment may be requested by an alignment directive such as
.align, .p2align or .balign, or emitted by the compiler directly. The addend
value of the relocation is the number of padding bytes.
The compiler and assembler must ensure that the padding bytes are valid
instructions without any side effects, such as nop or c.nop, and make sure
those instructions are aligned to IALIGN if possible.
The linker may remove part of the padding bytes during linking to meet the alignment requirement, and must make sure the remaining padding bytes are still valid instructions and that each instruction is aligned to at least IALIGN bytes.
Here is an example showing how R_RISCV_ALIGN is used:
0x0 c.nop # R_RISCV_ALIGN with addend 2
0x2 add t1, t2, t3 # This instruction must be aligned to 4 bytes.
Note
|
R_RISCV_ALIGN relocation is needed because linker relaxation can shrink
preceding code during the linking process, which may cause an aligned location
to become mis-aligned.
|
Note
|
Here is pseudocode to decide the alignment of an R_RISCV_ALIGN relocation:
|
# input:
#   addend: addend value of relocation with R_RISCV_ALIGN type.
# output:
#   Alignment of this relocation.
def align(addend):
    ALIGN = 1
    # Double ALIGN until it is strictly greater than the addend.
    while addend >= ALIGN:
        ALIGN *= 2
    return ALIGN
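Building on the alignment rule, the following sketch estimates how many padding bytes a linker would keep and how many it could delete once the final address of the padding is known. It is an illustration under the assumption that the padding address is already a multiple of IALIGN; `required_alignment` mirrors the pseudocode above, and `padding_to_keep` is a hypothetical helper rather than the behavior of any specific linker.

```python
def required_alignment(addend):
    """Smallest power of two strictly greater than the addend."""
    alignment = 1
    while addend >= alignment:
        alignment *= 2
    return alignment


def padding_to_keep(padding_address, addend):
    """Return (bytes kept, bytes removed) for an R_RISCV_ALIGN location."""
    alignment = required_alignment(addend)
    keep = -padding_address % alignment   # distance to the next aligned address
    if keep > addend:
        raise ValueError("not enough padding to reach the alignment")
    return keep, addend - keep


print(padding_to_keep(0x0, 2))     # already aligned to 4: drop both bytes
print(padding_to_keep(0x1004, 6))  # need 4 bytes to reach 0x1008
```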
RISC-V adopts the ELF Thread Local Storage Model in which ELF objects define
.tbss
and .tdata
sections and PT_TLS
program headers that contain the
TLS "initialization images" for new threads. The .tbss
and .tdata
sections
are not referenced directly like regular segments, rather they are copied or
allocated to the thread local storage space of newly created threads.
See ELF Handling For Thread-Local Storage [tls].
In the ELF Thread Local Storage Model, TLS offsets are used instead of pointers.
The ELF TLS sections are initialization images for the thread local variables of
each new thread. A TLS offset defines an offset into the dynamic thread vector
which is pointed to by the TCB (Thread Control Block). RISC-V uses Variant I as
described by the ELF TLS specification, with tp
containing the address one
past the end of the TCB.
There are various thread local storage models for statically allocated or dynamically allocated thread local storage. TLS models lists the thread local storage models:
Mnemonic | Model |
---|---|
TLS LE |
Local Exec |
TLS IE |
Initial Exec |
TLS LD |
Local Dynamic |
TLS GD |
Global Dynamic |
The program linker (in the case of static TLS) or the dynamic linker (in the case of dynamic TLS) allocates TLS offsets for storage of thread local variables.
Note
|
Global Dynamic model is also known as General Dynamic model.
|
Local exec is a form of static thread local storage. This model is used when static linking as the TLS offsets are resolved during program linking.
- Variable attribute
-
__thread int i __attribute__((tls_model("local-exec")));
Example assembler load and store of a thread local variable i
using the
%tprel_hi
, %tprel_add
and %tprel_lo
assembler functions. The emitted
relocations are in comments.
lui a5,%tprel_hi(i) # R_RISCV_TPREL_HI20 (symbol)
add a5,a5,tp,%tprel_add(i) # R_RISCV_TPREL_ADD (symbol)
lw t0,%tprel_lo(i)(a5) # R_RISCV_TPREL_LO12_I (symbol)
addi t0,t0,1
sw t0,%tprel_lo(i)(a5) # R_RISCV_TPREL_LO12_S (symbol)
The %tprel_add
assembler function does not return a value and is used purely
to associate the R_RISCV_TPREL_ADD
relocation with the add
instruction.
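The lui / lw pair above relies on splitting the thread-pointer-relative offset into an upper 20-bit part and a sign-extended lower 12-bit part. The following is a minimal sketch of that arithmetic, assuming a 32-bit offset; the function name split_hi_lo is hypothetical and is not defined by this ABI.

```python
def split_hi_lo(offset):
    """Split an offset into (hi20, lo12) so that (hi20 << 12) + lo12 equals
    the offset modulo 2**32, with lo12 in the range [-2048, 2047]."""
    # Sign-extend the low 12 bits, as load/store/addi immediates do.
    lo12 = ((offset & 0xFFF) ^ 0x800) - 0x800
    # Compensate in the upper part and wrap it to 20 bits for lui.
    hi20 = ((offset - lo12) >> 12) & 0xFFFFF
    return hi20, lo12


hi, lo = split_hi_lo(0x12345FFC)
print(hex(hi), lo)
```

Because the low part is sign-extended, an offset whose bit 11 is set yields a negative lo12 and an upper part that is one larger than a plain right shift would give.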
Initial exec is a form of static thread local storage that can be used in
shared libraries that use thread local storage. TLS relocations are performed
at load time. dlopen
calls to libraries that use thread local storage may fail
when using the initial exec thread local storage model as TLS offsets must all
be resolved at load time. This model uses the GOT to resolve TLS offsets.
- Variable attribute
-
__thread int i __attribute__((tls_model("initial-exec")));
- ELF flags
-
DF_STATIC_TLS
Example assembler load and store of a thread local variable i
using the
la.tls.ie
pseudoinstruction, with the emitted TLS relocations in comments:
la.tls.ie a5,i
add a5,a5,tp
lw t0,0(a5)
addi t0,t0,1
sw t0,0(a5)
The assembler pseudoinstruction:
la.tls.ie a5,symbol
expands to the following assembly instructions and relocations:
label:
auipc a5, 0 # R_RISCV_TLS_GOT_HI20 (symbol)
{ld,lw} a5, 0(a5) # R_RISCV_PCREL_LO12_I (label)
RISC-V local dynamic and global dynamic TLS models generate equivalent object code.
The Global dynamic thread local storage model is used for PIC Shared libraries and
handles the case where more than one library uses thread local variables, and
additionally allows libraries to be loaded and unloaded at runtime using dlopen
.
In the global dynamic model, application code calls the dynamic linker function
__tls_get_addr
to locate TLS offsets into the dynamic thread vector at runtime.
- Variable attribute
-
__thread int i __attribute__((tls_model("global-dynamic")));
Example assembler load and store of a thread local variable i
using the
la.tls.gd
pseudoinstruction, with the emitted TLS relocations in comments:
la.tls.gd a0,i
call __tls_get_addr@plt
mv a5,a0
lw t0,0(a5)
addi t0,t0,1
sw t0,0(a5)
The assembler pseudoinstruction:
la.tls.gd a0,symbol
expands to the following assembly instructions and relocations:
label:
auipc a0,0 # R_RISCV_TLS_GD_HI20 (symbol)
addi a0,a0,0 # R_RISCV_PCREL_LO12_I (label)
In the Global Dynamic model, the runtime library provides the __tls_get_addr
function:
extern void *__tls_get_addr (tls_index *ti);
where the type tls_index is defined as:
typedef struct
{
unsigned long int ti_module;
unsigned long int ti_offset;
} tls_index;
TLS Descriptors (TLSDESC) are an alternative implementation of the Global Dynamic model
that allows the dynamic linker to achieve performance close to that
of Initial Exec when the library was not loaded dynamically with dlopen
.
The linker reserves a consecutive pair of pointer-sized entries in the GOT for each TLSDESC
relocation. At runtime, the dynamic linker fills in the TLS descriptor entry as defined below:
typedef struct
{
unsigned long (*entry)(tls_descriptor *);
unsigned long arg;
} tls_descriptor;
Upon accessing the thread local variable, the entry
function is called with the address
of tls_descriptor
containing it, returning <address of thread local variable> - tp
.
The TLS descriptor entry
is called with a special calling convention, specified as follows:
- a0 is used to pass the argument and return value.
- t0 is used as the link register.
- Any other registers are callee-saved. This includes any vector registers when the vector extension is supported.
Example assembler load and store of a thread local variable i
using the %tlsdesc_hi
, %tlsdesc_load_lo
, %tlsdesc_add_lo
and %tlsdesc_call
assembler functions. The emitted relocations are in the comments.
label:
auipc tX, %tlsdesc_hi(symbol) // R_RISCV_TLSDESC_HI20 (symbol)
lw tY, tX, %tlsdesc_load_lo(label) // R_RISCV_TLSDESC_LOAD_LO12 (label)
addi a0, tX, %tlsdesc_add_lo(label) // R_RISCV_TLSDESC_ADD_LO12 (label)
jalr t0, tY, %tlsdesc_call(label) // R_RISCV_TLSDESC_CALL (label)
tX
and tY
in the example may be replaced with any combination of two general purpose registers.
The %tlsdesc_call
assembler function does not return a value and is used purely
to associate the R_RISCV_TLSDESC_CALL
relocation with the jalr
instruction.
The linker can use the relocations to recognize the sequence and to perform relaxations. To ensure correctness, only the following changes to the sequence are allowed:
- Instructions outside the sequence that do not clobber the registers used within the sequence may be inserted in-between the instructions of the sequence (known as instruction scheduling).
- Instructions in the sequence with no data dependency may be reordered. In the preceding example, the only instructions that can be reordered are lw and addi.
The defined processor-specific section types are listed in RISC-V-specific section types.
Name | Value | Attributes |
---|---|---|
SHT_RISCV_ATTRIBUTES | 0x70000003 | none |
RISC-V-specific sections lists the special sections defined by this ABI.
Name | Type | Attributes |
---|---|---|
.riscv.attributes | SHT_RISCV_ATTRIBUTES | none |
.riscv.jvt | SHT_PROGBITS | SHF_ALLOC + SHF_EXECINSTR |
.riscv.attributes names a section that contains RISC-V ELF attributes.
.riscv.jvt is a linker-created section to store table jump target addresses. The minimum alignment of this section is 64 bytes.
The defined processor-specific segment types are listed in RISC-V-specific segment types.
Name | Value | Meaning |
---|---|---|
PT_RISCV_ATTRIBUTES | 0x70000003 | RISC-V ELF attribute section. |
PT_RISCV_ATTRIBUTES
describes the location of RISC-V ELF attribute section.
The defined processor-specific dynamic array tags are listed in RISC-V-specific dynamic array tags.
Name | Value | d_un | Executable | Shared Object |
---|---|---|---|---|
DT_RISCV_VARIANT_CC | 0x70000001 | d_val | Platform specific | Platform specific |
An object must have the dynamic tag DT_RISCV_VARIANT_CC
if it has one or more
R_RISCV_JUMP_SLOT
relocations against symbols with the STO_RISCV_VARIANT_CC
attribute.
DT_INIT
and DT_FINI
are not required to be supported and should be avoided
in favour of DT_PREINIT_ARRAY
, DT_INIT_ARRAY
and DT_FINI_ARRAY
.
Attributes are used to record information about an object file/binary that a linker or runtime loader needs to check compatibility.
Attributes are encoded in a vendor-specific section of type SHT_RISCV_ATTRIBUTES and name .riscv.attributes. The value of an attribute can hold an integer encoded in the uleb128 format or a null-terminated byte string (NTBS). The tag number is also encoded as uleb128.
To improve compatibility across tools, attributes follow these rules:
- RISC-V attributes have a string value if the tag number is odd and an integer value if the tag number is even.
- The tag is mandatory if the tag number modulo 128 is less than 64 ((N % 128) < 64); a tool that does not recognize such an attribute should report an error.
- The tag is optional if the tag number modulo 128 is greater than or equal to 64 ((N % 128) >= 64); a tool that does not recognize such an attribute can ignore it.
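The rules above can be expressed as a small classifier. This is only an illustrative sketch; the function name classify_tag and the returned strings are hypothetical choices, not part of the ABI.

```python
def classify_tag(tag):
    """Return (value kind, requirement) for an attribute tag number."""
    # Odd tags carry a string (NTBS); even tags carry a uleb128 integer.
    value_kind = "string" if tag % 2 == 1 else "integer"
    # Tags with (N % 128) < 64 are mandatory; the others may be ignored.
    requirement = "mandatory" if tag % 128 < 64 else "optional"
    return value_kind, requirement


print(classify_tag(5))
```

A tool that meets an unrecognized tag could use the second element to decide between reporting an error and skipping the attribute.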
The attributes section starts with a format-version (uint8 = 'A') followed by vendor-specific sub-section(s). A sub-section starts with the sub-section length (uint32) and the vendor name (NTBS), followed by one or more sub-sub-section(s).
A sub-sub-section consists of a tag (uleb128) and the sub-sub-section length (uint32), followed by the actual attribute tag-value pair(s) as specified above. The sub-sub-section tag Tag_file (value 1) specifies that the contained attributes relate to the whole file.
A sub-section with name "riscv\0" is mandatory. Vendor-specific sub-sections may be allowed in the future. Vendor names starting with "[Aa]non" are reserved for non-standard ABI extensions.
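The layout described above can be walked with a short parser. The sketch below makes several assumptions that the text does not state: uint32 fields are little-endian, each length counts from the start of its own sub-section or sub-sub-section (including the length field itself), and the sub-sub-section tag is not interpreted. The names read_uleb128, read_ntbs and parse_attributes are hypothetical.

```python
import struct


def read_uleb128(buf, pos):
    """Decode one uleb128 value; return (value, next position)."""
    result, shift = 0, 0
    while True:
        byte = buf[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if byte & 0x80 == 0:
            return result, pos
        shift += 7


def read_ntbs(buf, pos):
    """Decode one null-terminated byte string; return (text, next position)."""
    end = buf.index(b"\0", pos)
    return buf[pos:end].decode("ascii"), end + 1


def parse_attributes(buf):
    """Return {vendor: {tag: value}} for an attributes section image."""
    if buf[0:1] != b"A":
        raise ValueError("unsupported format-version")
    pos, result = 1, {}
    while pos < len(buf):
        sub_start = pos
        (sub_len,) = struct.unpack_from("<I", buf, pos)  # assumed little-endian
        vendor, pos = read_ntbs(buf, pos + 4)
        sub_end = sub_start + sub_len   # assumption: length includes itself
        attrs = result.setdefault(vendor, {})
        while pos < sub_end:
            ss_start = pos
            _ss_tag, pos = read_uleb128(buf, pos)   # e.g. Tag_file (1), unused
            (ss_len,) = struct.unpack_from("<I", buf, pos)
            pos += 4
            ss_end = ss_start + ss_len  # assumption: length counts from the tag
            while pos < ss_end:
                tag, pos = read_uleb128(buf, pos)
                if tag % 2 == 1:        # odd tag: string value
                    value, pos = read_ntbs(buf, pos)
                else:                   # even tag: integer value
                    value, pos = read_uleb128(buf, pos)
                attrs[tag] = value
            pos = ss_end
        pos = sub_end
    return result
```

For real inputs, a production tool would also validate that every length stays within the section bounds; the sketch omits this for brevity.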
Tag | Value | Parameter type | Description |
---|---|---|---|
Tag_RISCV_stack_align | 4 | uleb128 | Indicates the stack alignment requirement in bytes. |
Tag_RISCV_arch | 5 | NTBS | Indicates the target architecture of this object. |
Tag_RISCV_unaligned_access | 6 | uleb128 | Indicates whether to impose unaligned memory accesses in code generation. |
Tag_RISCV_priv_spec | 8 | uleb128 | Deprecated, indicates the major version of the privileged specification. |
Tag_RISCV_priv_spec_minor | 10 | uleb128 | Deprecated, indicates the minor version of the privileged specification. |
Tag_RISCV_priv_spec_revision | 12 | uleb128 | Deprecated, indicates the revision version of the privileged specification. |
Tag_RISCV_atomic_abi | 14 | uleb128 | Indicates which version of the atomics ABI is being used. |
Tag_RISCV_x3_reg_usage | 16 | uleb128 | Indicates the usage definition of the X3 register. |
Reserved for non-standard attribute | >= 32768 | - | - |
Each attribute is described in the following structure:
<Tag name>, <Value>, <Parameter type 1>=<Parameter name 1>[, <Parameter type 2>=<Parameter name 2>]
Tag_RISCV_stack_align records the N-byte stack alignment for this object. The default value is 16 for RV32I or RV64I, and 4 for RV32E.
- Merge Policy
-
The linker should report errors when linking object files with different Tag_RISCV_stack_align values.
Tag_RISCV_arch contains a string for the target architecture taken from
the option -march
. Different architectures will be integrated into a superset
when object files are merged.
Tag_RISCV_arch should be recorded in lowercase, and all extensions should be
separated by underscores (_).
Note that the version information for target architecture must be presented
explicitly in the attribute and abbreviations must be expanded. The version
information, if not given by -march
, must agree with the default
specified by the tool. For example, the architecture rv32i
has to be recorded
in the attribute as rv32i2p1
in which 2p1
stands for the default version of
its base ISA. On the other hand, the architecture rv32g
has to be presented
as rv32i2p1_m2p0_a2p1_f2p2_d2p2_zicsr2p0_zifencei2p0
in which the
abbreviation g
is expanded to the imafd_zicsr_zifencei
combination with
default versions of the standard extensions.
The toolchain should normalize the architecture string into the canonical order defined in The RISC-V Instruction Set Manual, Volume I: User-Level ISA, Document [riscv-unpriv], expand it with all required extensions, and add a shorthand extension to the architecture string if all of the extensions it expands to are included in the architecture string.
Note
|
A shorthand extension is an extension that does not define any actual
instructions, registers or behavior, but requires other extensions. One example
is the zks extension, defined in the cryptographic extension, which is shorthand
for zbkb, zbkc, zbkx, zksed and zksh, so the toolchain should normalize
rv32i_zbkb_zbkc_zbkx_zksed_zksh to rv32i_zbkb_zbkc_zbkx_zks_zksed_zksh; g is an
exception and this rule does not apply to it.
|
- Merge Policy
-
The linker should merge the different architectures into a superset when object files are merged, and should report errors if the merge result contains conflicting extensions.
This specification does not mandate rules on how to merge ISA strings that refer to different versions of the same ISA extension. The suggested merge rules are as follows:
- Merge versions into the latest version of all input versions that are ratified without warning or error.
- The linker should emit a warning or error if input versions have different versions and any extension versions are not ratified.
- The linker may report a warning or error if it detects incompatible versions, even if they are ratified.
Note
|
Example of conflicting merge result: RV32IF and RV32IZfinx will
be merged into RV32IFZfinx , which is an invalid architecture since F and
Zfinx conflict.
|
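As an illustration of the merge policy, the sketch below parses two fully expanded architecture strings and merges them into a superset. It is a simplification under stated assumptions: the base is taken to be the first four characters (rv32 or rv64), every extension carries an explicit version, the later version always wins regardless of ratification status, and only the F versus Zfinx conflict from the note above is checked. The names parse_arch and merge_arch are hypothetical.

```python
import re

# <extension name><major>p<minor>, e.g. "zicsr2p0" or "i2p1".
_EXT = re.compile(r"^([a-z][a-z0-9]*?)(\d+)p(\d+)$")


def parse_arch(arch):
    """Parse 'rv32i2p1_m2p0' into ('rv32', {'i': (2, 1), 'm': (2, 0)})."""
    base, rest = arch[:4], arch[4:]
    exts = {}
    for token in rest.split("_"):
        m = _EXT.match(token)
        if m is None:
            raise ValueError(f"extension without explicit version: {token!r}")
        exts[m.group(1)] = (int(m.group(2)), int(m.group(3)))
    return base, exts


def merge_arch(a, b):
    """Merge two architecture strings into (base, superset of extensions)."""
    base_a, exts_a = parse_arch(a)
    base_b, exts_b = parse_arch(b)
    if base_a != base_b:
        raise ValueError("different base ISA")
    merged = dict(exts_a)
    for name, version in exts_b.items():
        # Simplification: keep the later version, ignoring ratification.
        merged[name] = max(merged.get(name, version), version)
    if "f" in merged and "zfinx" in merged:
        raise ValueError("conflicting extensions: f and zfinx")
    return base_a, merged


print(merge_arch("rv32i2p1_m2p0", "rv32i2p0_f2p2"))
```

The result is a dictionary rather than a canonical string; ordering the extensions canonically and adding shorthand extensions would be separate steps.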
Tag_RISCV_unaligned_access denotes the code generation policy for this object file. Its values are defined as follows:
0 | This object does not perform any unaligned memory accesses. |
1 | This object may perform unaligned memory accesses. |
- Merge policy
-
Input files may have different values for Tag_RISCV_unaligned_access; the linker should set this field to 1 if any of the input objects has it set.
Warning
|
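The three privileged specification attributes (Tag_RISCV_priv_spec, Tag_RISCV_priv_spec_minor and Tag_RISCV_priv_spec_revision) are deprecated, since RISC-V now versions the privileged ISA through extensions rather than through a single privileged specification version scheme. |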
Tag_RISCV_priv_spec contains the major/minor/revision version information of the privileged specification.
- Merge policy
-
The linker should report errors if object files of different privileged specification versions are merged.
Tag_RISCV_atomic_abi denotes the atomic ABI used within this object file. Its values are defined as follows:
Value | Symbolic Name | Description |
---|---|---|
0 | UNKNOWN | This object uses unknown atomic ABI. |
1 | A6C | This object uses the A6 classical atomic ABI, which is defined in table A.6 in [riscv-unpriv-20191213]. |
2 | A6S | This object uses the strengthened A6 ABI, which uses the atomic mapping defined by [Mappings from C/C++ primitives to RISC-V primitives] and does not rely on any note 3 annotated mappings. |
3 | A7 | This object uses the A7 atomic ABI, which uses the atomic mapping defined by [Mappings from C/C++ primitives to RISC-V primitives] and may rely on note 3 annotated mappings. |
- Merge policy
-
The linker should report errors if object files with incompatible atomics ABIs are merged; the compatibility rules for atomic ABIs can be found in the compatibility column in the following table.
Input Values | Compatible? | Output Value |
---|---|---|
UNKNOWN and A6C | Yes | A6C |
UNKNOWN and A6S | Yes | A6S |
UNKNOWN and A7 | Yes | A7 |
A6C and A6S | Yes | A6C |
A6C and A7 | No | - |
A6S and A7 | Yes | A7 |
Note
|
Merging object files with the same ABI will result in the same ABI. |
Note
|
Programs that implement atomic operations without relying on the A-extension are classified as UNKNOWN for now. A new value for those may be defined in the future. |
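The compatibility table can be encoded directly as a lookup. This is a sketch only; the constant names mirror the symbolic names in the table, while merge_atomic_abi is a hypothetical helper name.

```python
UNKNOWN, A6C, A6S, A7 = 0, 1, 2, 3

# Output value for each compatible pair of different input values.
_MERGE = {
    frozenset((UNKNOWN, A6C)): A6C,
    frozenset((UNKNOWN, A6S)): A6S,
    frozenset((UNKNOWN, A7)): A7,
    frozenset((A6C, A6S)): A6C,
    frozenset((A6S, A7)): A7,
}


def merge_atomic_abi(a, b):
    """Return the merged Tag_RISCV_atomic_abi value or raise ValueError."""
    if a == b:
        return a            # same ABI merges to itself
    try:
        return _MERGE[frozenset((a, b))]
    except KeyError:
        raise ValueError("incompatible atomic ABIs") from None


print(merge_atomic_abi(A6S, A7))
```

Using frozenset keys makes the lookup independent of the order in which the two inputs are given.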
Tag_RISCV_x3_reg_usage indicates the usage of x3
/gp
register. x3
/gp
could be used for
global pointer relaxation, as a reserved platform register, or as a temporary register.
0 |
This object uses |
1 |
This object uses |
2 |
This object uses |
3 |
This object uses |
4~1023 |
Reserved for future standard defined platform register. |
1024~2047 |
Reserved for nonstandard defined platform register. |
- Merge policy
-
The linker should issue errors when object files with differing gp usage are combined. However, an exception exists: the value 0 can merge with the value 1 or 2. After the merge, the resulting value will be the non-zero one.
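A sketch of this merge policy, with merge_x3_reg_usage as a hypothetical helper name:

```python
def merge_x3_reg_usage(a, b):
    """Merge two Tag_RISCV_x3_reg_usage values or raise ValueError."""
    if a == b:
        return a
    # Exception: 0 may merge with 1 or 2, yielding the non-zero value.
    if 0 in (a, b) and max(a, b) in (1, 2):
        return max(a, b)
    raise ValueError(f"incompatible x3/gp usage: {a} and {b}")


print(merge_x3_reg_usage(0, 2))
```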
The section can have a mixture of code and data or code with different ISAs. A number of symbols, named mapping symbols, describe the boundaries.
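Symbol Name | Meaning |
---|---|
$d | Start of a sequence of data. |
$d.<any> | Start of a sequence of data. |
$x | Start of a sequence of instructions. |
$x.<any> | Start of a sequence of instructions. |
$x<ISA> | Start of a sequence of instructions with <ISA> extension. |
$x<ISA>.<any> | Start of a sequence of instructions with <ISA> extension. |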
A mapping symbol should have its type set to STT_NOTYPE, its binding set to
STB_LOCAL, and its size set to zero.
The mapping symbol for data ($d) indicates the start of a sequence of data bytes.
The mapping symbol for instructions ($x) indicates the start of a sequence of
instructions. It has an optional ISA string, which means that the following code
region uses an ISA different from the one recorded in the arch attribute; this
ISA information applies until the next instruction mapping symbol. An
instruction mapping symbol without an ISA string means that the code uses the
ISA configuration from the ELF attribute.
The format and rules of the optional ISA string are the same as for
Tag_RISCV_arch, and it must have explicit versions; for more detailed rules,
please refer to Attributes.
The mapping symbol can be followed by an optional uniquifier, which is prefixed
with a dot (.
).
Note
|
The use case for an instruction mapping symbol ($x) with ISA information is
ifunc. For example, a library is built with rv64gc, but a few functions
like memcpy provide two versions, one built with rv64gc and one built with
rv64gcv, selected by the ifunc mechanism at run-time. However, the arch
attribute records the minimal execution environment requirements, so the
ISA information from the arch attribute is not enough for the disassembler to
disassemble the rv64gcv version correctly.
|
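A disassembler could split a mapping symbol name into its parts roughly as follows. This sketch assumes that the ISA string never contains a dot (versions are written with p, as in 2p1), so the first dot starts the uniquifier; parse_mapping_symbol is a hypothetical name.

```python
def parse_mapping_symbol(name):
    """Return (kind, isa, uniquifier), or None if name is not a mapping symbol."""
    if not name.startswith(("$d", "$x")):
        return None
    kind = "data" if name[1] == "d" else "instructions"
    # Everything before the first dot is the optional ISA string.
    body, dot, uniquifier = name[2:].partition(".")
    if kind == "data" and body:
        return None  # $d takes no ISA string
    return kind, body or None, uniquifier if dot else None


print(parse_mapping_symbol("$xrv64i2p1_v1p0.memcpy"))
```

A returned ISA of None means that the ISA configuration from the ELF attribute applies.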
At link time, when all the memory objects have been resolved, the code sequence used to refer to them may be simplified and optimized by the linker by relaxing some assumptions about the memory layout made at compile time.
Some relocation types, in certain situations, indicate to the linker where this can happen. Additionally, some relocation types indicate to the linker the associated parts of a code sequence that can be thusly simplified, rather than to instruct the linker how to apply a relocation.
The linker should only perform such relaxations when a R_RISCV_RELAX relocation is at the same position as a candidate relocation.
As this transformation may delete bytes (and thus invalidate references that are commonly resolved at compile-time, such as intra-function jumps), code generators must in general ensure that relocations are always emitted when relaxation is enabled.
Linkers should adjust relocations that refer to symbols whose addresses have been updated.
A ULEB128 value with a relocation must be padded to the same length even if the data can be encoded with a shorter byte sequence after linker relaxation. The linker should report an error if the ULEB128 byte sequence would need to be longer than the current byte sequence.
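The padding rule can be sketched as an encoder that always emits a fixed number of bytes and fails when the value does not fit. The name encode_uleb128_padded is hypothetical.

```python
def encode_uleb128_padded(value, length):
    """Encode value as ULEB128 using exactly `length` bytes."""
    out = bytearray()
    for i in range(length):
        byte = value & 0x7F
        value >>= 7
        if i < length - 1:
            byte |= 0x80  # continuation bit keeps the field at full length
        out.append(byte)
    if value:
        # Corresponds to the case where the linker should report an error.
        raise ValueError("value does not fit in the existing ULEB128 field")
    return bytes(out)


print(encode_uleb128_padded(1, 2).hex())
```

A standard ULEB128 decoder reads the padded form without changes, because the extra bytes only contribute zero bits.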
The purpose of this section is to describe all types of linker relaxation. A linker may implement only a subset of the relaxation types and can skip any relaxation type it does not support.
Each candidate relocation might fit more than one relaxation type, but the linker should apply only one relaxation type to it.
In the linker relaxation optimization, we introduce a concept called relocation
group; a relocation group consists of 1) relocations associated with the same
target symbol and can be applied with the same relaxation, or 2) relocations
with the linkage relationship (e.g. R_RISCV_PCREL_LO12_S
linked with
a R_RISCV_PCREL_HI20
); all relocations in a single group must be present in
the same section; otherwise, they are split into separate relocation groups.
Every relocation group must apply the same relaxation type, and the linker should not apply linker relaxation to only part of the relocation group.
Note
|
Applying relaxation to only part of a relocation group might produce a
wrong execution result. For example, suppose a relocation group consists of
lui t0, 0 # R_RISCV_HI20 (foo) and lw t1, 0(t0) # R_RISCV_LO12_I (foo). If the
linker applies global pointer relaxation to the first instruction only and
removes it, without relaxing the second instruction, the load instruction
ends up referencing an unspecified address.
|
- Target Relocation
-
R_RISCV_CALL, R_RISCV_CALL_PLT.
- Description
-
This relaxation type can relax AUIPC+JALR into JAL.
- Condition
-
The offset between the location of relocation and target symbol or the PLT stub of the target symbol is within +-1MiB.
- Relaxation
-
- Instruction sequence associated with R_RISCV_CALL or R_RISCV_CALL_PLT can be rewritten to a single JAL instruction with the offset between the location of relocation and target symbol.
- Example
-
Relaxation candidate:
auipc ra, 0 # R_RISCV_CALL_PLT (symbol), R_RISCV_RELAX
jalr ra, ra, 0
Relaxation result:
jal ra, 0 # R_RISCV_JAL (symbol)
Note
|
Whether the address of the target symbol's PLT stub or the address of the target symbol itself is used is decided by the linker according to the visibility of the target symbol. |
- Target Relocation
-
R_RISCV_CALL, R_RISCV_CALL_PLT.
- Description
-
This relaxation type can relax AUIPC+JALR into a C.JAL instruction.
- Condition
-
The offset between the location of relocation and target symbol or the PLT stub of the target symbol is within +-2KiB, the rd operand of the second instruction in the instruction sequence is X1/RA, and the target is RV32.
- Relaxation
-
- Instruction sequence associated with R_RISCV_CALL or R_RISCV_CALL_PLT can be rewritten to a single C.JAL instruction with the offset between the location of relocation and target symbol.
- Example
-
Relaxation candidate:
auipc ra, 0 # R_RISCV_CALL_PLT (symbol), R_RISCV_RELAX
jalr ra, ra, 0
Relaxation result:
c.jal ra, <offset-between-pc-and-symbol>
- Target Relocation
-
R_RISCV_CALL, R_RISCV_CALL_PLT.
- Description
-
This relaxation type can relax AUIPC+JALR into a C.J instruction.
- Condition
-
The offset between the location of relocation and target symbol or the PLT stub of the target symbol is within +-2KiB and the rd operand of the second instruction in the instruction sequence is X0.
- Relaxation
-
- Instruction sequence associated with R_RISCV_CALL or R_RISCV_CALL_PLT can be rewritten to a single C.J instruction with the offset between the location of relocation and target symbol.
- Example
-
Relaxation candidate:
auipc ra, 0 # R_RISCV_CALL_PLT (symbol), R_RISCV_RELAX
jalr x0, ra, 0
Relaxation result:
c.j ra, <offset-between-pc-and-symbol>
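The conditions of the three call relaxations above can be combined into one selection routine. This is a sketch under assumptions that the text does not state: the ranges are treated as half-open signed intervals, the compressed forms are tried before JAL because they are shorter, and availability of compressed instructions is passed in as a has_rvc flag. The name pick_call_relaxation and its parameters are hypothetical.

```python
def pick_call_relaxation(offset, rd, xlen=64, has_rvc=True):
    """Choose a relaxed form for an AUIPC+JALR pair, or None.

    offset: byte distance from the relocation to the target symbol
            (or to its PLT stub).
    rd:     destination register of the jalr, e.g. "ra" or "x0".
    """
    in_2k = -(1 << 11) <= offset < (1 << 11)   # +-2KiB
    in_1m = -(1 << 20) <= offset < (1 << 20)   # +-1MiB
    if has_rvc and in_2k and rd == "x0":
        return "c.j"
    if has_rvc and in_2k and rd == "ra" and xlen == 32:
        return "c.jal"
    if in_1m:
        return "jal"
    return None


print(pick_call_relaxation(100, "ra", xlen=32))
```

Returning at most one form reflects the rule that only one relaxation type is applied to a candidate relocation.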
- Target Relocation
-
R_RISCV_HI20, R_RISCV_LO12_I, R_RISCV_LO12_S, R_RISCV_PCREL_HI20, R_RISCV_PCREL_LO12_I, R_RISCV_PCREL_LO12_S
- Description
-
This relaxation type can relax a sequence of the load address of a symbol or load/store with a symbol reference into global-pointer-relative instruction.
- Condition
-
Global-pointer relaxation requires that Tag_RISCV_x3_reg_usage is 0 or 1 and that the offset between the global pointer and the symbol is within +-2KiB. R_RISCV_PCREL_LO12_I and R_RISCV_PCREL_LO12_S are resolved as indirect relocation pointers: each always points to another R_RISCV_PCREL_HI20 relocation, and the symbol pointed to by that R_RISCV_PCREL_HI20 is used in the offset calculation.
- Relaxation
-
- Instruction associated with R_RISCV_HI20 or R_RISCV_PCREL_HI20 can be removed.
- Instruction associated with R_RISCV_LO12_I, R_RISCV_LO12_S, R_RISCV_PCREL_LO12_I or R_RISCV_PCREL_LO12_S can be replaced with a global-pointer-relative access instruction.
- Example
-
Relaxation candidate (tX and tY can be any combination of two general purpose registers):
lui tX, 0 # R_RISCV_HI20 (symbol), R_RISCV_RELAX
lw tY, 0(tX) # R_RISCV_LO12_I (symbol), R_RISCV_RELAX
Relaxation result:
lw tY, <gp-offset-for-symbol>(gp)
A symbol can be loaded in multiple fragments using different addends, where multiple instructions associated with R_RISCV_LO12_I / R_RISCV_LO12_S share a single R_RISCV_HI20. The HI20 values for the multiple fragments must be identical and all the relaxed global-pointer offsets must be in range.
Relaxation candidate:
lui tX, 0 # R_RISCV_HI20 (symbol), R_RISCV_RELAX
lw tY, 0(tX) # R_RISCV_LO12_I (symbol), R_RISCV_RELAX
lw tZ, 0(tX+4) # R_RISCV_LO12_I (symbol+4), R_RISCV_RELAX
lw tW, 0(tX+8) # R_RISCV_LO12_I (symbol+8), R_RISCV_RELAX
lw tX, 0(tX+12) # R_RISCV_LO12_I (symbol+12), R_RISCV_RELAX
Relaxation result:
lw tY, <gp-offset-for-symbol>(gp)
lw tZ, <gp-offset-for-symbol+4>(gp)
lw tW, <gp-offset-for-symbol+8>(gp)
lw tX, <gp-offset-for-symbol+12>(gp)
Note
|
The global-pointer refers to the address of the __global_pointer$
symbol, which is the content of gp register.
|
Note
|
This relaxation requires the program to initialize the gp register with
the address of the __global_pointer$ symbol before accessing any symbol address.
It is strongly recommended to initialize gp at the beginning of the program
entry function, such as _start. The initialization code must disable
linker relaxation to prevent the initialization instructions from being relaxed
into a NOP-like instruction (e.g. mv gp, gp).
|
# Recommended way to initialize the gp register.
.option push
.option norelax
1: auipc gp, %pcrel_hi(__global_pointer$)
addi gp, gp, %pcrel_lo(1b)
.option pop
Note
|
The global pointer is referred to as the global offset table pointer in
many other targets. However, RISC-V uses PC-relative addressing rather than
accessing the GOT via the global pointer register (gp), so the gp register is
used to optimize the code size and performance of symbol accesses.
|
Note
|
Tag_RISCV_x3_reg_usage is treated as 0 if it is not present. |
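The condition for global-pointer relaxation can be checked as in the sketch below. It assumes that "+-2KiB" means the signed 12-bit immediate range [-2048, 2047]; the name gp_relative_offset and its parameters are hypothetical.

```python
def gp_relative_offset(symbol_addr, gp_addr, x3_reg_usage=0):
    """Return the gp-relative offset if relaxation is allowed, else None.

    x3_reg_usage defaults to 0, matching the rule that a missing
    Tag_RISCV_x3_reg_usage is treated as 0.
    """
    if x3_reg_usage not in (0, 1):
        return None
    offset = symbol_addr - gp_addr
    # Assumption: the offset must fit a signed 12-bit immediate.
    return offset if -2048 <= offset <= 2047 else None


print(gp_relative_offset(0x10800, 0x10000 + 0x7F0))
```

For a group of fragments sharing one R_RISCV_HI20, the check would have to succeed for every addend before any instruction in the group is rewritten.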
- Target Relocation
-
R_RISCV_GOT_HI20, R_RISCV_PCREL_LO12_I
- Description
-
This relaxation can relax a GOT indirection into load immediate or PC-relative addressing. This relaxation is intended to optimize the lga assembly pseudo-instruction (and thus la for PIC objects), which loads a symbol’s address from a GOT entry with an auipc + l[w|d] instruction pair.
- Condition
-
- Both R_RISCV_GOT_HI20 and R_RISCV_PCREL_LO12_I are marked with R_RISCV_RELAX.
- The symbol pointed to by R_RISCV_PCREL_LO12_I is at the location to which R_RISCV_GOT_HI20 refers.
- If the symbol is absolute, its address is within 0x0 ~ 0x7ff or 0xfffffffffffff800 ~ 0xffffffffffffffff for RV64 and 0xfffff800 ~ 0xffffffff for RV32. Note that an undefined weak symbol satisfies this condition because such a symbol is handled as if it were an absolute symbol at address 0.
- If the symbol is relative, it’s bound at link time to be within the object. It should not be of the GNU ifunc type. Additionally, the offset between the location to which R_RISCV_GOT_HI20 refers and the target symbol should be within a range of +-2GiB.
- Relaxation
-
- The auipc instruction associated with R_RISCV_GOT_HI20 can be removed if the symbol is absolute.
- The instruction or instructions associated with R_RISCV_PCREL_LO12_I can be rewritten to either c.li or addi to materialize the symbol’s address directly in a register.
- If this relaxation eliminates all references to the symbol’s GOT slot, the linker may opt not to create a GOT slot for that symbol.
- Example
-
Relaxation candidate:
label:
auipc tX, 0 # R_RISCV_GOT_HI20 (symbol), R_RISCV_RELAX
l[w|d] tY, 0(tX) # R_RISCV_PCREL_LO12_I (label), R_RISCV_RELAX
Relaxation result (absolute symbol whose address can be represented as a 6-bit signed integer and if the RVC instruction is permitted):
c.li tY, <symbol-value>
Relaxation result (absolute symbol and did not meet the above condition to use c.li):
addi tY, zero, <symbol-value>
Relaxation result (relative symbol):
auipc tX, <hi>
addi tY, tX, <lo>
- Target Relocation
-
R_RISCV_HI20, R_RISCV_LO12_I, R_RISCV_LO12_S
- Description
-
This relaxation type can relax a sequence of the load address of a symbol or load/store with a symbol reference into shorter instruction sequence if possible.
- Condition
-
The symbol address is located within 0x0 ~ 0x7ff or 0xfffffffffffff800 ~ 0xffffffffffffffff for RV64 and 0xfffff800 ~ 0xffffffff for RV32.
- Relaxation
-
- Instruction associated with R_RISCV_HI20 can be removed if the symbol address satisfies the x0-relative access.
- Instruction associated with R_RISCV_LO12_I or R_RISCV_LO12_S can be relaxed into x0-relative access.
- Example
-
Relaxation candidate:
lui t0, 0 # R_RISCV_HI20 (symbol), R_RISCV_RELAX
lw t1, 0(t0) # R_RISCV_LO12_I (symbol), R_RISCV_RELAX
Relaxation result:
lw t1, <address-of-symbol>(x0)
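The address ranges in the condition are the addresses reachable from x0 with a sign-extended 12-bit immediate. The sketch below checks this by interpreting the address as a signed XLEN-bit value; that reading of the ranges is an interpretation of the text, and x0_relative_offset is a hypothetical name.

```python
def x0_relative_offset(addr, xlen=64):
    """Return the signed 12-bit immediate reaching addr from x0, or None."""
    addr &= (1 << xlen) - 1
    # Reinterpret the address as a signed XLEN-bit integer.
    signed = addr - (1 << xlen) if addr >> (xlen - 1) else addr
    return signed if -2048 <= signed <= 2047 else None


print(x0_relative_offset(0xFFFFFFFFFFFFF800))
```

The same 32-bit pattern 0xfffff800 is in range on RV32 but not on RV64, because on RV64 it is a large positive address.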
- Target Relocation
-
R_RISCV_HI20, R_RISCV_LO12_I, R_RISCV_LO12_S
- Description
-
This relaxation type can relax a sequence of the load address of a symbol or load/store with a symbol reference into shorter instruction sequence if possible.
- Condition
-
The symbol address can be represented by a C.LUI plus an ADDI or load / store instruction.
- Relaxation
-
- Instruction associated with R_RISCV_HI20 can be replaced with C.LUI.
- Instruction associated with R_RISCV_LO12_I or R_RISCV_LO12_S should be kept unchanged.
- Example
-
Relaxation candidate:
lui t0, 0 # R_RISCV_HI20 (symbol), R_RISCV_RELAX
lw t1, 0(t0) # R_RISCV_LO12_I (symbol), R_RISCV_RELAX
Relaxation result:
c.lui t0, <non-zero> # RVC_LUI (symbol), R_RISCV_RELAX
lw t1, 0(t0) # R_RISCV_LO12_I (symbol), R_RISCV_RELAX
- Target Relocation
-
R_RISCV_TPREL_HI20, R_RISCV_TPREL_ADD, R_RISCV_TPREL_LO12_I, R_RISCV_TPREL_LO12_S.
- Description
-
This relaxation type can relax a sequence of the load address of a symbol or load/store with a thread-local symbol reference into a thread-pointer-relative instruction.
- Condition
-
Offset between thread-pointer and thread-local symbol is within +-2KiB.
- Relaxation
-
- Instruction associated with R_RISCV_TPREL_HI20 or R_RISCV_TPREL_ADD can be removed.
- Instruction associated with R_RISCV_TPREL_LO12_I or R_RISCV_TPREL_LO12_S can be replaced with a thread-pointer-relative access instruction.
- Example
-
Relaxation candidate:
lui t0, 0 # R_RISCV_TPREL_HI20 (symbol), R_RISCV_RELAX
add t0, t0, tp # R_RISCV_TPREL_ADD (symbol), R_RISCV_RELAX
lw t1, 0(t0) # R_RISCV_TPREL_LO12_I (symbol), R_RISCV_RELAX
Relaxation result:
lw t1, <tp-offset-for-symbol>(tp)
- Target Relocation
-
R_RISCV_TLSDESC_HI20, R_RISCV_TLSDESC_LOAD_LO12, R_RISCV_TLSDESC_ADD_LO12, R_RISCV_TLSDESC_CALL
- Description
-
This relaxation can relax a sequence loading the address of a thread-local symbol reference into a GOT load instruction.
- Condition
-
- Linker output is an executable.
- Relaxation
-
- Instruction associated with R_RISCV_TLSDESC_HI20 or R_RISCV_TLSDESC_LOAD_LO12 can be removed.
- Instruction associated with R_RISCV_TLSDESC_ADD_LO12 can be replaced with load of the high half of the symbol’s GOT address.
- Instruction associated with R_RISCV_TLSDESC_CALL can be replaced with load of the low half of the symbol’s GOT address.
- Example
-
Relaxation candidate (tX and tY can be any combination of two general purpose registers):
label:
auipc tX, <hi> // R_RISCV_TLSDESC_HI20 (symbol), R_RISCV_RELAX
lw tY, tX, <lo> // R_RISCV_TLSDESC_LOAD_LO12 (label), R_RISCV_RELAX
addi a0, tX, <lo> // R_RISCV_TLSDESC_ADD_LO12 (label), R_RISCV_RELAX
jalr t0, tY // R_RISCV_TLSDESC_CALL (label), R_RISCV_RELAX
Relaxation result:
auipc a0, <pcrel-got-offset-for-symbol-hi>
{ld,lw} a0, <pcrel-got-offset-for-symbol-lo>(a0)
- Target Relocation
-
R_RISCV_TLSDESC_HI20, R_RISCV_TLSDESC_LOAD_LO12, R_RISCV_TLSDESC_ADD_LO12, R_RISCV_TLSDESC_CALL
- Description
-
This relaxation can relax a sequence loading the address of a thread-local symbol reference into a thread-pointer-relative instruction sequence.
- Condition
-
- Short form only: Offset between thread-pointer and thread-local symbol is within +-2KiB.
- Linker output is an executable.
- Target symbol is non-preemptible.
- Relaxation
-
- Instruction associated with R_RISCV_TLSDESC_HI20 or R_RISCV_TLSDESC_LOAD_LO12 can be removed.
- Instruction associated with R_RISCV_TLSDESC_ADD_LO12 can be replaced with the high TP-relative offset of symbol (long form) or be removed (short form).
- Instruction associated with R_RISCV_TLSDESC_CALL can be replaced with the low TP-relative offset of symbol.
- Example
-
Relaxation candidate (tX and tY can be any combination of two general purpose registers):
label:
auipc tX, <hi> // R_RISCV_TLSDESC_HI20 (symbol), R_RISCV_RELAX
lw tY, tX, <lo> // R_RISCV_TLSDESC_LOAD_LO12 (label), R_RISCV_RELAX
addi a0, tX, <lo> // R_RISCV_TLSDESC_ADD_LO12 (label), R_RISCV_RELAX
jalr t0, tY // R_RISCV_TLSDESC_CALL (label), R_RISCV_RELAX
Relaxation result (long form):
lui a0, <tp-offset-for-symbol-hi>
addi a0, a0, <tp-offset-for-symbol-lo>
Relaxation result (short form):
addi a0, zero, <tp-offset-for-symbol>
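The choice between the long and short forms depends only on whether the TP-relative offset fits a signed 12-bit immediate. The sketch below produces the relaxed instruction text for a given offset; it assumes that "+-2KiB" means [-2048, 2047] and that the offset fits in 32 bits, and tlsdesc_to_le is a hypothetical name.

```python
def tlsdesc_to_le(tp_offset):
    """Return the relaxed instruction sequence for a TP-relative offset."""
    if -2048 <= tp_offset <= 2047:
        # Short form: the whole offset fits in one addi immediate.
        return [f"addi a0, zero, {tp_offset}"]
    # Long form: sign-extended low 12 bits plus a compensated upper 20 bits.
    lo = ((tp_offset & 0xFFF) ^ 0x800) - 0x800
    hi = ((tp_offset - lo) >> 12) & 0xFFFFF
    return [f"lui a0, {hi:#x}", f"addi a0, a0, {lo}"]


print(tlsdesc_to_le(0x12345))
```

In both forms a0 ends up holding the offset of the variable from tp, which is the value the descriptor's entry function would have returned.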
- Target Relocation
-
R_RISCV_CALL, R_RISCV_CALL_PLT, R_RISCV_JAL.
- Description
-
This relaxation type can relax a function call or jump instruction into a single table jump instruction with the index of the target address in table jump section (RISC-V-specific sections). Before relaxation, the linker scans all relocations and calculates whether additional gains can be obtained by using table jump instructions, where expected size saving from function-call-related relaxations and the size of jump table will be taken into account. If there is no additional gain, then table jump relaxation is ignored. Otherwise, this relaxation is switched on. Compressed Tail Call Relaxation and Compressed Function Call Relaxation are always prefered during relaxation, since table jump relaxation has no extra size saving over these two relaxations and might bring a performance overhead.
- Condition
-
The zcmt extension is required, the linker output is not position-independent, and the rd operand of a function call or jump instruction is X0 or RA.
- Relaxation
-
-
Instruction sequence associated with R_RISCV_CALL or R_RISCV_CALL_PLT can be rewritten to a table jump instruction.
-
Instruction associated with R_RISCV_JAL can be rewritten to a table jump instruction.
-
- Example
-
Relaxation candidate:
auipc ra, 0        # R_RISCV_CALL (symbol), R_RISCV_RELAX (symbol)
jalr ra, ra, 0

auipc ra, 0        # R_RISCV_CALL_PLT (symbol), R_RISCV_RELAX (symbol)
jalr x0, ra, 0

jal ra, 0          # R_RISCV_JAL (symbol), R_RISCV_RELAX (symbol)

jal x0, 0          # R_RISCV_JAL (symbol), R_RISCV_RELAX (symbol)
Relaxation result:
cm.jalt <index-for-symbol>

cm.jt <index-for-symbol>

cm.jalt <index-for-symbol>

cm.jt <index-for-symbol>
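The gain calculation mentioned in the description is not specified in detail here, so the following Python sketch shows only one possible cost model. It assumes that each table jump instruction is 2 bytes (`cm.jt` and `cm.jalt` are 16-bit instructions), that each table entry occupies XLEN/8 bytes, and that the linker already knows how large each call site would be after the other call relaxations. The names `CallSite`, `size_without_table_jump`, and `table_jump_gain` are hypothetical.

```python
from dataclasses import dataclass

TABLE_JUMP_SIZE = 2  # bytes; cm.jt and cm.jalt are 16-bit instructions


@dataclass
class CallSite:
    symbol: str
    # Size in bytes this site would have after the other call relaxations
    # (hypothetical input, e.g. 4 for a jal, 8 for an auipc+jalr pair).
    size_without_table_jump: int


def table_jump_gain(sites: list[CallSite], xlen: int = 32) -> dict[str, int]:
    """Return the net byte saving per symbol that is worth a table entry."""
    entry_size = xlen // 8  # one target address per table entry
    saved: dict[str, int] = {}
    for site in sites:
        delta = site.size_without_table_jump - TABLE_JUMP_SIZE
        saved[site.symbol] = saved.get(site.symbol, 0) + delta
    # Keep only symbols whose total saving exceeds the cost of their entry.
    return {sym: s - entry_size for sym, s in saved.items() if s > entry_size}


sites = [CallSite("foo", 4)] * 3 + [CallSite("bar", 4)]
print(table_jump_gain(sites, xlen=32))
print(table_jump_gain(sites, xlen=64))
```

Under this model a symbol called only once never pays for its table entry, which is consistent with the statement that the relaxation is skipped when there is no additional gain.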
Note
|
The zcmt extension cannot be used in position-independent binaries.
|
Note
|
Jump or call instructions with the rd operand RA will be relaxed into
cm.jalt and instructions with the rd operand X0 will be relaxed into
cm.jt. The table jump section holds target addresses for these two
instructions separately. More details are available in the ZC* extension
specification [riscv-zc-extension-group].
|
Note
|
This relaxation requires programs to initialize the jvt CSR with the
address of the __jvt_base$ symbol before executing table jump
instructions. It is recommended to initialize the jvt CSR immediately after
global pointer initialization.
|
# Recommended way to initialize the jvt CSR.
1: auipc a0, %pcrel_hi(__jvt_base$)
addi a0, a0, %pcrel_lo(1b)
csrw jvt, a0
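To illustrate why jvt must be set before any table jump executes, the following Python sketch models how a table jump instruction locates its entry. It relies on details taken from the Zcmt specification rather than from this document, so treat them as assumptions to verify there: the low 6 bits of jvt hold a mode field, entries are XLEN/8 bytes wide, `cm.jt` uses indices 0 to 31, and `cm.jalt` uses indices 32 to 255. The function names are hypothetical.

```python
def table_jump_mnemonic(index: int) -> str:
    """Pick the instruction that encodes a given table index."""
    if not 0 <= index <= 255:
        raise ValueError("table jump index must fit in 8 bits")
    # Indices below 32 do not link (cm.jt); the rest link to ra (cm.jalt).
    return "cm.jt" if index < 32 else "cm.jalt"


def entry_address(jvt: int, index: int, xlen: int = 32) -> int:
    """Address of the table entry read by a table jump instruction."""
    base = jvt & ~0x3F  # clear the 6-bit mode field to get the table base
    return base + index * (xlen // 8)


print(table_jump_mnemonic(31), table_jump_mnemonic(32))
print(hex(entry_address(0x10000, 32, xlen=32)))
```

If jvt still holds its reset value, `entry_address` points at unrelated memory, which is why the initialization sequence above should run before the first relaxed call.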
-
[gabi] "Generic System V Application Binary Interface" http://www.sco.com/developers/gabi/latest/contents.html
-
[itanium-cxx-abi] "Itanium C++ ABI" http://itanium-cxx-abi.github.io/cxx-abi/
-
[rv-asm] "RISC-V Assembly Programmer’s Manual" https://github.com/riscv-non-isa/riscv-asm-manual
-
[tls] "ELF Handling For Thread-Local Storage" https://www.akkadia.org/drepper/tls.pdf, Ulrich Drepper
-
[riscv-unpriv] "The RISC-V Instruction Set Manual, Volume I: User-Level ISA, Document", Editors Andrew Waterman and Krste Asanović, RISC-V International.
-
[riscv-unpriv-20191213] "The RISC-V Instruction Set Manual, Volume I: User-Level ISA, Document release 20191213", Editors Andrew Waterman and Krste Asanović, RISC-V International.
-
[riscv-zc-extension-group] "ZC* extension specification" https://github.com/riscv/riscv-code-size-reduction
-
[rvv-intrinsic-doc] "RISC-V Vector Extension Intrinsic Document" https://github.com/riscv-non-isa/rvv-intrinsic-doc
-
[rv-toolchain-conventions] "RISC-V Toolchain Conventions" https://github.com/riscv-non-isa/riscv-toolchain-conventions