Here is a table of the instruction names that are meaningful in the RTL generation pass of the compiler. Giving one of these names to an instruction pattern tells the RTL generation pass that it can use the pattern to accomplish a certain task.
If operand 0 is a subreg with mode m of a register whose own mode is wider than m, the effect of this instruction is to store the specified value in the part of the register that corresponds to mode m. Bits outside of m, but which are within the same target word as the subreg, are undefined. Bits which are outside the target word are left unchanged.
This class of patterns is special in several ways. First of all, each of these names up to and including full word size must be defined, because there is no other way to copy a datum from one place to another. If there are patterns accepting operands in larger modes, ‘movm’ must be defined for integer modes of those sizes.
Second, these patterns are not used solely in the RTL generation pass. Even the reload pass can generate move insns to copy values from stack slots into temporary registers. When it does so, one of the operands is a hard register and the other is an operand that may need to be reloaded into a register.
Therefore, when given such a pair of operands, the pattern must generate RTL which needs no reloading and needs no temporary registers, that is, no registers other than the operands. For example, if you support the pattern with a define_expand, then in such a case the define_expand mustn't call force_reg or any other such function which might generate new pseudo registers.
This requirement exists even for subword modes on a RISC machine where fetching those modes from memory normally requires several insns and some temporary registers.
During reload a memory reference with an invalid address may be passed as an operand. Such an address will be replaced with a valid address later in the reload pass. In this case, nothing may be done with the address except to use it as it stands. If it is copied, it will not be replaced with a valid address. No attempt should be made to make such an address into a valid address and no routine (such as change_address) that will do so may be called. Note that general_operand will fail when applied to such an address.
The global variable reload_in_progress (which must be explicitly declared if required) can be used to determine whether such special handling is required.
The variety of operands that have reloads depends on the rest of the machine description, but typically on a RISC machine these can only be pseudo registers that did not get hard registers, while on other machines explicit memory references will get optional reloads.
If a scratch register is required to move an object to or from memory, it can be allocated using gen_reg_rtx prior to life analysis.
If there are cases which need scratch registers during or after reload, you must provide an appropriate secondary_reload target hook.
The macro can_create_pseudo_p can be used to determine whether it is safe to create new pseudo registers. If it evaluates to zero, then it is unsafe to call gen_reg_rtx to allocate a new pseudo.
The constraints on a ‘movm’ must permit moving any hard register to any other hard register provided that HARD_REGNO_MODE_OK permits mode m in both registers and TARGET_REGISTER_MOVE_COST applied to their classes returns a value of 2.
It is obligatory to support floating point ‘movm’ instructions into and out of any registers that can hold fixed point values, because unions and structures (which have modes SImode or DImode) can be in those registers and they may have floating point members.
There may also be a need to support fixed point ‘movm’ instructions in and out of floating point registers. Unfortunately, I have forgotten why this was so, and I don't know whether it is still true. If HARD_REGNO_MODE_OK rejects fixed point values in floating point registers, then the constraints of the fixed point ‘movm’ instructions must be designed to avoid ever trying to reload into a floating point register.
secondary_reload.
Like ‘movm’, but used when a scratch register is required to move between operand 0 and operand 1. Operand 2 describes the scratch register. See the discussion of the SECONDARY_RELOAD_CLASS macro in Register Classes.
There are special restrictions on the form of the match_operands used in these patterns. First, only the predicate for the reload operand is examined, i.e., reload_in examines operand 1, but not the predicates for operand 0 or 2. Second, there may be only one alternative in the constraints. Third, only a single register class letter may be used for the constraint; subsequent constraint letters are ignored. As a special exception, an empty constraint string matches the ALL_REGS register class. This may relieve ports of the burden of defining an ALL_REGS constraint letter just for these patterns.
subreg with mode m of a register whose natural mode is wider, the ‘movstrictm’ instruction is guaranteed not to alter any of the register except the part which belongs to mode m.
This pattern is used by the autovectorizer, and when expanding a MISALIGNED_INDIRECT_REF expression.
Define this only if the target machine really has such an instruction; do not define this if the most efficient way of loading consecutive registers from memory is to do them one at a time.
On some machines, there are restrictions as to which consecutive registers can be stored into memory, such as particular starting or ending register numbers or only a range of valid counts. For those machines, use a define_expand (see Expander Definitions) and make the pattern fail if the restrictions are not met.
Write the generated insn as a parallel with elements being a set of one register from the appropriate memory location (you may also need use or clobber elements). Use a match_parallel (see RTL Template) to recognize the insn. See rs6000.md for examples of the use of this insn pattern.
     int c = GET_MODE_SIZE (m) / GET_MODE_SIZE (n);
     for (j = 0; j < GET_MODE_NUNITS (n); j++)
       for (i = 0; i < c; i++)
         operand0[i][j] = operand1[j * c + i];
For example, ‘vec_load_lanestiv4hi’ loads 8 16-bit values from memory into a register of mode ‘TI’. The register contains two consecutive vectors of mode ‘V4HI’.
This pattern can only be used if:
TARGET_ARRAY_MODE_SUPPORTED_P (n, c)
is true. GCC assumes that, if a target supports this kind of instruction for some mode n, it also supports unaligned loads for vectors of mode n.
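The de-interleaving loop above can be tried out on ordinary arrays. The following is only a sketch for the ‘vec_load_lanestiv4hi’ case (two vectors of four 16-bit elements); the names load_lanes, C and NUNITS are hypothetical and are not part of GCC.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model, not a GCC interface: C vectors of NUNITS
   16-bit elements, as in the TI / V4HI example.  */
enum { C = 2, NUNITS = 4 };

/* De-interleave C * NUNITS consecutive elements from SRC so that
   DST[i] receives every C-th element starting at index i.  */
static void
load_lanes (int16_t dst[C][NUNITS], const int16_t *src)
{
  assert (dst != NULL && src != NULL);
  for (size_t j = 0; j < NUNITS; j++)
    for (size_t i = 0; i < C; i++)
      dst[i][j] = src[j * C + i];
}
```

With memory holding the values 0 through 7, the first vector receives the even-indexed elements and the second the odd-indexed ones.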
     int c = GET_MODE_SIZE (m) / GET_MODE_SIZE (n);
     for (j = 0; j < GET_MODE_NUNITS (n); j++)
       for (i = 0; i < c; i++)
         operand0[j * c + i] = operand1[i][j];
for a memory operand 0 and register operand 1.
The input elements are numbered from 0 in operand 1 through 2*N-1 in operand 2. The elements of the selector must be computed modulo 2*N. Note that if rtx_equal_p(operand1, operand2), this can be implemented with just operand 1 and selector elements modulo N.
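The element numbering can be illustrated with a scalar model. This is a minimal sketch, assuming 32-bit elements; vec_perm_model is a hypothetical helper and not a GCC function.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical scalar model of a two-input permute on N-element
   vectors.  Indices 0..N-1 select from OP1, N..2*N-1 from OP2.  */
static void
vec_perm_model (int32_t *out, const int32_t *op1, const int32_t *op2,
                const uint32_t *sel, size_t n)
{
  assert (n > 0);
  for (size_t k = 0; k < n; k++)
    {
      size_t idx = sel[k] % (2 * n);   /* selector taken modulo 2*N */
      out[k] = idx < n ? op1[idx] : op2[idx - n];
    }
}
```

A selector element of 15 with N = 4 is reduced to 7 and therefore picks the last element of operand 2.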
In order to make things easy for a number of targets, if there is no ‘vec_perm’ pattern for mode m, but there is for mode q where q is a vector of QImode of the same width as m, the middle-end will lower the mode m VEC_PERM_EXPR to mode q.
CONST_VECTOR.
Some targets cannot perform a permutation with a variable selector, but can efficiently perform a constant permutation. Further, the target hook vec_perm_ok is queried to determine if the specific constant permutation is available efficiently; the named pattern is never expanded without vec_perm_ok returning true.
There is no need for a target to supply both ‘vec_permm’ and ‘vec_perm_constm’ if the former can trivially implement the operation with, say, the vector constant loaded into a register.
PUSH_ROUNDING is defined. For historical reasons, this pattern may be missing and in such a case a mov expander is used instead, with a MEM expression forming the push operation. The mov expander method is deprecated.
‘addm3’ but takes a code_label as operand 3 and emits code to jump to it if signed overflow occurs during the addition. This pattern is used to implement the built-in functions performing signed integer addition with overflow checking.
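From the user's side, the overflow-checking addition looks roughly as follows. This sketch uses the GCC and Clang built-in __builtin_add_overflow; the wrapper name checked_add is hypothetical, and whether a given target expands the built-in through this pattern depends on its machine description.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

/* Returns true and stores the sum in *SUM when A + B fits in an int;
   returns false when signed overflow would occur.  */
static bool
checked_add (int a, int b, int *sum)
{
  assert (sum != NULL);
  return !__builtin_add_overflow (a, b, sum);
}
```

The "jump to the label" in the pattern corresponds to the false return path here.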
‘mulvm4’ but for unsigned multiplication. That is to say, the operation is the same as signed multiplication but the jump is taken only on unsigned overflow.
‘addm3’ but is guaranteed to only be used for address calculations. The expanded code is not allowed to clobber the condition code. It only needs to be defined if ‘addm3’ sets the condition code. If adds used for address calculations and normal adds are not compatible it is required to expand a distinct pattern (e.g. using an unspec). The pattern is used by LRA to emit address calculations. ‘addm3’ is used if ‘addptrm3’ is not defined.
fma, fmaf, and fmal builtin functions from the ISO C99 standard.
‘fmam4’, except operand 3 is subtracted from the product instead of added to the product. This is represented in the rtl as
(fma:m op1 op2 (neg:m op3))
‘fmam4’ except that the intermediate product is negated before being added to operand 3. This is represented in the rtl as
(fma:m (neg:m op1) op2 op3)
‘fmsm4’ except that the intermediate product is negated before subtracting operand 3. This is represented in the rtl as
(fma:m (neg:m op1) op2 (neg:m op3))
NaN, then it is unspecified which of the two operands is returned as the result.
HImode, and store a SImode product in operand 0.
In other words, ‘maddmn4’ is like ‘mulmn3’ except that it also adds operand 3.
These instructions are not allowed to FAIL.
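As a rough illustration of the widening multiply-accumulate with m = HImode and n = SImode, the signed and unsigned forms can be modelled in C as below. The function names are hypothetical; the sketch assumes the accumulation wraps modulo 2^32 and relies on GCC's modular conversion from unsigned to signed.

```c
#include <assert.h>
#include <stdint.h>

/* Signed form: sign-extend both 16-bit factors, multiply, add ACC.  */
static int32_t
madd_hi_si (int16_t a, int16_t b, int32_t acc)
{
  int32_t product = (int32_t) a * (int32_t) b;
  /* A 16x16-bit signed product always fits in 32 bits.  */
  assert (product >= -0x3FFF8000 && product <= 0x40000000);
  return (int32_t) ((uint32_t) product + (uint32_t) acc);
}

/* Unsigned form: zero-extend the factors instead.  */
static uint32_t
umadd_hi_si (uint16_t a, uint16_t b, uint32_t acc)
{
  return (uint32_t) a * (uint32_t) b + acc;
}
```

The same bit pattern 0xFFFF gives different results in the two forms: it is -1 when sign-extended and 65535 when zero-extended.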
‘maddmn4’, but zero-extend the multiplication operands instead of sign-extending them.
‘maddmn4’, but all involved operations must be signed-saturating.
‘umaddmn4’, but all involved operations must be unsigned-saturating.
In other words, ‘msubmn4’ is like ‘mulmn3’ except that it also subtracts the result from operand 3.
These instructions are not allowed to FAIL.
‘msubmn4’, but zero-extend the multiplication operands instead of sign-extending them.
‘msubmn4’, but all involved operations must be signed-saturating.
‘umsubmn4’, but all involved operations must be unsigned-saturating.
For machines with an instruction that produces both a quotient and a remainder, provide a pattern for ‘divmodm4’ but do not provide patterns for ‘divm3’ and ‘modm3’. This allows optimization in the relatively common case when both the quotient and remainder are computed.
If an instruction that just produces a quotient or just a remainder exists and is more efficient than the instruction that produces both, write the output routine of ‘divmodm4’ to call find_reg_note and look for a REG_UNUSED note on the quotient or remainder and generate the appropriate instruction.
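The "relatively common case" in which both results are wanted looks like this at the source level. The sketch uses the standard C library routine div, which returns quotient and remainder together; divmod_int is a hypothetical wrapper, and how a compiler maps such code onto ‘divmodm4’ is target dependent.

```c
#include <assert.h>
#include <stdlib.h>

/* Compute quotient and remainder of NUM / DEN in one step, as a
   combined divide instruction would.  Division truncates toward
   zero, so the remainder has the sign of NUM.  */
static void
divmod_int (int num, int den, int *quot, int *rem)
{
  assert (den != 0);
  div_t r = div (num, den);
  *quot = r.quot;
  *rem = r.rem;
}
```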
TARGET_SHIFT_TRUNCATION_MASK.
See TARGET_SHIFT_TRUNCATION_MASK. Operand 2 is always a scalar type.
‘ashlm3’ instructions. Operand 2 is always a scalar type.
‘negm2’ but takes a code_label as operand 2 and emits code to jump to it if signed overflow occurs during the negation.
The sqrt built-in function of C always uses the mode which corresponds to the C data type double and the sqrtf built-in function uses the mode which corresponds to the C data type float.
The fmod built-in function of C always uses the mode which corresponds to the C data type double and the fmodf built-in function uses the mode which corresponds to the C data type float.
The remainder built-in function of C always uses the mode which corresponds to the C data type double and the remainderf built-in function uses the mode which corresponds to the C data type float.
The cos built-in function of C always uses the mode which corresponds to the C data type double and the cosf built-in function uses the mode which corresponds to the C data type float.
The sin built-in function of C always uses the mode which corresponds to the C data type double and the sinf built-in function uses the mode which corresponds to the C data type float.
The sin and cos built-in functions of C always use the mode which corresponds to the C data type double and the sinf and cosf built-in functions use the mode which corresponds to the C data type float.
Targets that can calculate the sine and cosine simultaneously can implement this pattern as opposed to implementing individual ‘sinm2’ and ‘cosm2’ patterns. The sin and cos built-in functions will then be expanded to the ‘sincosm3’ pattern, with one of the output values left unused.
The exp built-in function of C always uses the mode which corresponds to the C data type double and the expf built-in function uses the mode which corresponds to the C data type float.
The log built-in function of C always uses the mode which corresponds to the C data type double and the logf built-in function uses the mode which corresponds to the C data type float.
The pow built-in function of C always uses the mode which corresponds to the C data type double and the powf built-in function uses the mode which corresponds to the C data type float.
The atan2 built-in function of C always uses the mode which corresponds to the C data type double and the atan2f built-in function uses the mode which corresponds to the C data type float.
The floor built-in function of C always uses the mode which corresponds to the C data type double and the floorf built-in function uses the mode which corresponds to the C data type float.
The trunc built-in function of C always uses the mode which corresponds to the C data type double and the truncf built-in function uses the mode which corresponds to the C data type float.
The round built-in function of C always uses the mode which corresponds to the C data type double and the roundf built-in function uses the mode which corresponds to the C data type float.
The ceil built-in function of C always uses the mode which corresponds to the C data type double and the ceilf built-in function uses the mode which corresponds to the C data type float.
The nearbyint built-in function of C always uses the mode which corresponds to the C data type double and the nearbyintf built-in function uses the mode which corresponds to the C data type float.
The rint built-in function of C always uses the mode which corresponds to the C data type double and the rintf built-in function uses the mode which corresponds to the C data type float.
The copysign built-in function of C always uses the mode which corresponds to the C data type double and the copysignf built-in function uses the mode which corresponds to the C data type float.
The ffs built-in function of C always uses the mode which corresponds to the C data type int.
CLZ_DEFINED_VALUE_AT_ZERO (see Misc) macro defines if the result is undefined or has a useful value.
m is the mode of operand 0; operand 1's mode is specified by the instruction pattern, and the compiler will convert the operand to that mode before generating the instruction.
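Because the result for a zero input may be undefined, portable code that counts leading zeros usually handles zero itself. A minimal sketch, assuming the GCC and Clang built-in __builtin_clz (undefined for 0) and choosing the operand width as the "useful value"; that choice and the name clz_defined are assumptions of this example, not something the target macro mandates.

```c
#include <assert.h>
#include <limits.h>

/* Count leading zero bits of X, returning the bit width of
   unsigned int for X == 0 (an assumption of this sketch).  */
static int
clz_defined (unsigned int x)
{
  const int width = (int) (sizeof (unsigned int) * CHAR_BIT);
  if (x == 0)
    return width;
  int n = __builtin_clz (x);
  assert (n >= 0 && n < width);
  return n;
}
```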
CTZ_DEFINED_VALUE_AT_ZERO (see Misc) macro defines if the result is undefined or has a useful value.
m is the mode of operand 0; operand 1's mode is specified by the instruction pattern, and the compiler will convert the operand to that mode before generating the instruction.
mem:BLKs with an address in mode Pmode.
The number of bytes to move is the third operand, in mode m. Usually, you specify Pmode for m. However, if you can generate better code knowing the range of valid lengths is smaller than those representable in a full Pmode pointer, you should provide a pattern with a mode corresponding to the range of values you can handle efficiently (e.g., QImode for values in the range 0–127; note we avoid numbers that appear negative) and also a pattern with Pmode.
The fourth operand is the known shared alignment of the source and destination, in the form of a const_int rtx. Thus, if the compiler knows that both source and destination are word-aligned, it may provide the value 4 for this operand.
Optional operands 5 and 6 specify expected alignment and size of block respectively. The expected alignment differs from the alignment in operand 4 in that the blocks are not required to be aligned according to it in all cases. This expected alignment is also in bytes, just like operand 4. Expected size, when unknown, is set to (const_int -1).
Descriptions of multiple ‘movmemm’ patterns can only be beneficial if the patterns for smaller modes have fewer restrictions on their first, second and fourth operands. Note that the mode m in ‘movmemm’ does not impose any restriction on the mode of individually moved data units in the block.
These patterns need not give special consideration to the possibility that the source and destination strings might overlap.
stpcpy semantics. Operand 0 is an output operand in mode Pmode. The addresses of the destination and source strings are operands 1 and 2, and both are mem:BLKs with addresses in mode Pmode. The execution of the expansion of this pattern should store in operand 0 the address in which the NUL terminator was stored in the destination string.
This pattern also has several optional operands that are the same as in ‘setmem’.
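The stpcpy-style result described above, namely the address at which the NUL terminator was stored, can be sketched in portable C as follows. The name copy_string_end is hypothetical; it is used instead of the POSIX stpcpy so that the example needs no feature-test macros.

```c
#include <assert.h>
#include <stddef.h>

/* Copy SRC, including its terminating NUL, to DST and return the
   address in DST where the NUL was stored.  */
static char *
copy_string_end (char *dst, const char *src)
{
  assert (dst != NULL && src != NULL);
  while ((*dst = *src) != '\0')
    {
      dst++;
      src++;
    }
  return dst;
}
```

Returning the end address lets a caller append further text without rescanning the destination.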
mem:BLK whose address is in mode Pmode. The number of bytes to set is the second operand, in mode m. The value to initialize the memory with is the third operand. Targets that only support the clearing of memory should reject any value that is not the constant 0. See ‘movmemm’ for a discussion of the choice of mode.
The fourth operand is the known alignment of the destination, in the form of a const_int rtx. Thus, if the compiler knows that the destination is word-aligned, it may provide the value 4 for this operand.
Optional operands 5 and 6 specify expected alignment and size of block respectively. The expected alignment differs from the alignment in operand 4 in that the blocks are not required to be aligned according to it in all cases. This expected alignment is also in bytes, just like operand 4. Expected size, when unknown, is set to (const_int -1).
Operand 7 is the minimal size of the block and operand 8 is the maximal size of the block (NULL if it cannot be represented as CONST_INT). Operand 9 is the probable maximal size (i.e. we cannot rely on it for correctness, but it can be used for choosing the proper code sequence for a given size).
The use for multiple ‘setmemm’ is as for ‘movmemm’.
mem:BLK with an address in mode Pmode.
The fourth operand is the known shared alignment of the source and destination, in the form of a const_int rtx. Thus, if the compiler knows that both source and destination are word-aligned, it may provide the value 4 for this operand.
The two memory blocks specified are compared byte by byte in lexicographic order starting at the beginning of each string. The instruction is not allowed to prefetch more than one byte at a time since either string may end in the first byte and reading past that may access an invalid page or segment and cause a fault. The comparison will terminate when the fetched bytes are different or if they are equal to zero. The effect of the instruction is to store a value in operand 0 whose sign indicates the result of the comparison.
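The comparison rule in the preceding paragraph, one byte at a time and stopping at the first difference or at a zero byte, corresponds to the following reference loop. It is a sketch only; compare_strings is a hypothetical name, and the exact nonzero values returned are an assumption (only the sign matters).

```c
#include <assert.h>
#include <stddef.h>

/* Compare two NUL-terminated byte strings lexicographically.
   Never reads past the first differing byte or the terminator.  */
static int
compare_strings (const unsigned char *a, const unsigned char *b)
{
  assert (a != NULL && b != NULL);
  for (;; a++, b++)
    {
      if (*a != *b)
        return *a < *b ? -1 : 1;
      if (*a == 0)
        return 0;
    }
}
```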
mem referring to the first character of the string, operand 2 is the character to search for (normally zero), and operand 3 is a constant describing the known alignment of the beginning of the string.
If the machine description defines this pattern, it also needs to define the ftrunc pattern.
Operands 0 and 1 both have mode m. Operands 2 and 3 have a target-specific mode.
Operand 0 has mode m while operand 1 has BLK mode. Operands 2 and 3 have a target-specific mode.
The instruction must not read beyond the last byte of the bit-field.
Operands 0 and 3 both have mode m. Operands 1 and 2 have a target-specific mode.
Operand 3 has mode m while operand 0 has BLK mode. Operands 1 and 2 have a target-specific mode.
The instruction must not read or write beyond the last byte of the bit-field.
word_mode.
Operand 1 may have mode byte_mode or word_mode; often word_mode is allowed only for registers. Operands 2 and 3 must be valid for word_mode.
The RTL generation pass generates this instruction only with constants for operands 2 and 3 and the constant is never zero for operand 2.
The bit-field value is sign-extended to a full word integer before it is stored in operand 0.
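The sign-extending extraction can be modelled on a 32-bit word as shown below. This is only a sketch: it assumes bits are numbered from the least significant end (the actual numbering depends on the target), a 32-bit word, and GCC's modular unsigned-to-signed conversion; extract_signed_field is a hypothetical name.

```c
#include <assert.h>
#include <stdint.h>

/* Extract WIDTH bits of WORD starting at bit START (bit 0 is the
   least significant bit in this sketch) and sign-extend them.  */
static int32_t
extract_signed_field (uint32_t word, unsigned width, unsigned start)
{
  assert (width >= 1 && start + width <= 32);
  uint32_t field = word >> start;
  if (width < 32)
    field &= (UINT32_C (1) << width) - 1;
  uint32_t sign = UINT32_C (1) << (width - 1);
  /* (field ^ sign) - sign copies the field's top bit upward.  */
  return (int32_t) ((field ^ sign) - sign);
}
```

Dropping the final xor and subtract step gives the zero-extending variant.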
This pattern is deprecated; please use ‘extvm’ and ‘extvmisalignm’ instead.
This pattern is deprecated; please use ‘extzvm’ and ‘extzvmisalignm’ instead.
word_mode) into a bit-field in operand 0, where operand 1 specifies the width in bits and operand 2 the starting bit. Operand 0 may have mode byte_mode or word_mode; often word_mode is allowed only for registers. Operands 1 and 2 must be valid for word_mode.
The RTL generation pass generates this instruction only with constants for operands 1 and 2 and the constant is never zero for operand 1.
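Bit-field insertion with a width and a starting bit can be sketched in the same style. The function below assumes a 32-bit word and bit numbering from the least significant end, which is an assumption of this example; insert_field is a hypothetical name.

```c
#include <assert.h>
#include <stdint.h>

/* Store the low WIDTH bits of VALUE into WORD at bit START (bit 0 is
   the least significant bit here), leaving other bits unchanged.  */
static uint32_t
insert_field (uint32_t word, unsigned width, unsigned start, uint32_t value)
{
  assert (width >= 1 && start + width <= 32);
  uint32_t mask = width < 32 ? (UINT32_C (1) << width) - 1 : UINT32_MAX;
  return (word & ~(mask << start)) | ((value & mask) << start);
}
```

The width is required to be nonzero, matching the statement that the constant is never zero for operand 1.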
This pattern is deprecated; please use ‘insvm’ and ‘insvmisalignm’ instead.
The mode of the operands being compared need not be the same as the operands being moved. Some machines, sparc64 for example, have instructions that conditionally move an integer value based on the floating point condition codes and vice versa.
If the machine does not have conditional move instructions, do not define these patterns.
match_operand expression. The compiler automatically sees which mode you have used and supplies an operand of that mode.
The value stored for a true condition must have 1 as its low bit, or else must be negative. Otherwise the instruction is not suitable and you should omit it from the machine description. You describe to the compiler exactly which value is stored by defining the macro STORE_FLAG_VALUE (see Misc). If a description cannot be found that can be used for all the possible comparison operators, you should pick one and use a define_expand to map all results onto the one you chose.
These operations may FAIL, but should do so only in relatively uncommon cases; if they would FAIL for common cases involving integer comparisons, it is best to restrict the predicates to not allow these operands. Likewise if a given comparison operator will always fail, independent of the operands (for floating-point modes, the ordered_comparison_operator predicate is often useful in this case).
If this pattern is omitted, the compiler will generate a conditional branch (for example, it may copy a constant one to the target and branch around an assignment of zero to the target) or a libcall. If the predicate for operand 1 only rejects some operators, it will also try reordering the operands and/or inverting the result value (e.g. by an exclusive OR). These possibilities could be cheaper or equivalent to the instructions used for the ‘cstoremode4’ pattern followed by those required to convert a positive result from STORE_FLAG_VALUE to 1; in this case, you can and should make operand 1's predicate reject some operators in the ‘cstoremode4’ pattern, or remove the pattern altogether from the machine description.
code_label to jump to.
code_label to jump to. This pattern name is mandatory on all machines.
const_int; operand 2 is the number of registers used as operands.
On most machines, operand 2 is not actually stored into the RTL pattern. It is supplied for the sake of some RISC machines which need to put this information into the assembler code; they can put it in the RTL instead of operand 1.
Operand 0 should be a mem RTX whose address is the address of the function. Note, however, that this address can be a symbol_ref expression even if it would not be a legitimate memory address on the target machine. If it is also not a valid argument for a call instruction, the pattern for this operation should be a define_expand (see Expander Definitions) that places the address into a register and uses that register in the call instruction.
Subroutines that return BLKmode objects use the ‘call’ insn.
RETURN_POPS_ARGS is nonzero. They should emit a parallel that contains both the function call and a set to indicate the adjustment made to the frame pointer.
For machines where RETURN_POPS_ARGS can be nonzero, the use of these patterns increases the number of functions for which the frame pointer can be eliminated, if desired.
parallel expression where each element is a set expression that indicates the saving of a function return value into the result block.
This instruction pattern should be defined to support __builtin_apply on machines where special instructions are needed to call a subroutine with arbitrary arguments or to save the value returned. This instruction pattern is required on machines that have multiple registers that can hold a return value (i.e. FUNCTION_VALUE_REGNO_P is true for more than one register).
Like the ‘movm’ patterns, this pattern is also used after the RTL generation phase. In this case it is to support machines where multiple instructions are usually needed to return from a function, but some class of functions only requires one instruction to implement a return. Normally, the applicable functions are those which do not need to save any registers or allocate stack space.
It is valid for this pattern to expand to an instruction using simple_return if no epilogue is required.
return instruction pattern, but it is emitted only by the shrink-wrapping optimization on paths where the function prologue has not been executed, and a function return should occur without any of the effects of the epilogue. Additional uses may be introduced on paths where both the prologue and the epilogue have executed.
For such machines, the condition specified in this pattern should only be true when reload_completed is nonzero and the function's epilogue would only be a single instruction. For machines with register windows, the routine leaf_function_p may be used to determine if a register window push is required.
Machines that have conditional return instructions should define patterns such as
     (define_insn ""
       [(set (pc)
             (if_then_else (match_operator 0 "comparison_operator"
                                           [(cc0) (const_int 0)])
                           (return)
                           (pc)))]
       "condition"
       "...")
where condition would normally be the same condition specified on the named ‘return’ pattern.
__builtin_return on machines where special instructions are needed to return a value of any type.
Operand 0 is a memory location where the result of calling a function with __builtin_apply is stored; operand 1 is a parallel expression where each element is a set expression that indicates the restoring of a function return value from the result block.
(const_int 0) will do as an RTL pattern.
SImode.
The table is an addr_vec or addr_diff_vec inside of a jump_table_data. The number of elements in the table is one plus the difference between the upper bound and the lower bound.
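The element count stated above, together with the usual way of turning a case index into a table slot, can be sketched as follows. Only the count formula comes from the text; the single unsigned range check and the use of -1 to mean "take the default label" are assumptions of this example, and both function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Number of entries in a dispatch table covering [LOWER, UPPER]:
   one plus the difference between the bounds.  */
static size_t
table_entries (int lower, int upper)
{
  assert (lower <= upper);
  return (size_t) ((unsigned) upper - (unsigned) lower) + 1;
}

/* Map INDEX to a table slot, or return -1 when INDEX lies outside
   [LOWER, UPPER].  One unsigned comparison covers both bounds.  */
static long
table_slot (int index, int lower, int upper)
{
  unsigned offset = (unsigned) index - (unsigned) lower;
  if (offset > (unsigned) upper - (unsigned) lower)
    return -1;
  return (long) offset;
}
```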
This pattern requires two operands: the address or offset, and a label which should immediately precede the jump table. If the macro CASE_VECTOR_PC_RELATIVE evaluates to a nonzero value then the first operand is an offset which counts from the address of the table; otherwise, it is an absolute address to jump to. In either case, the first operand has mode Pmode.
The ‘tablejump’ insn is always the last insn before the jump table it uses. Its assembler code normally has no need to use the second operand, but you should incorporate it in the RTL pattern so that the jump optimizer will not delete the table as unreachable code.
This optional instruction pattern is only used by the combiner, typically for loops reversed by the loop optimizer when strength reduction is enabled.
This optional instruction pattern should be defined for machines with low-overhead looping instructions as the loop optimizer will try to modify suitable loops to utilize it. The target hook TARGET_CAN_USE_DOLOOP_P controls the conditions under which low-overhead loops can be used.
doloop_end required for machines that need to perform some initialization, such as loading a special counter register. Operand 1 is the associated doloop_end pattern and operand 0 is the register that it decrements.
If initialization insns do not always need to be emitted, use a define_expand (see Expander Definitions) and make it fail.
Operand 0 is always a reg and has mode Pmode; operand 1 may be a reg, mem, symbol_ref, const_int, etc and also has mode Pmode.
Canonicalization of a function pointer usually involves computing the address of the function which would be called if the function pointer were used in an indirect call.
Only define this pattern if function pointers on the target machine can have different values but still call the same function when used in an indirect call.
Pmode. Do not define these patterns on such machines.
Some machines require special handling for stack pointer saves and restores. On those machines, define the patterns corresponding to the non-standard cases by using a define_expand (see Expander Definitions) that produces the required insns. The three types of saves and restores are:
alloca. Only the epilogue uses the restored stack pointer, allowing a simpler save or restore sequence on some machines.
When saving the stack pointer, operand 0 is the save area and operand 1 is the stack pointer. The mode used to allocate the save area defaults to Pmode but you can override that choice by defining the STACK_SAVEAREA_MODE macro (see Storage Layout). You must specify an integral mode, or VOIDmode if no save area is needed for a particular type of save (either because no save is needed or because a machine-specific save area can be used). Operand 0 is the stack pointer and operand 1 is the save area for restore operations. If ‘save_stack_block’ is defined, operand 0 must not be VOIDmode since these saves can be arbitrarily nested.
A save area is a mem that is at a constant offset from virtual_stack_vars_rtx when the stack pointer is saved for use by nonlocal gotos and a reg in the other two cases.
STACK_GROWS_DOWNWARD is undefined) operand 1 from the stack pointer to create space for dynamically allocated data.
Store the resultant pointer to this space into operand 0. If you are allocating space from the main stack, do this by emitting a move insn to copy virtual_stack_dynamic_rtx to operand 0. If you are allocating the space elsewhere, generate code to copy the location of the space to operand 0. In the latter case, you must ensure this space gets freed when the corresponding space on the main stack is free.
Do not define this pattern if all that must be done is the subtraction. Some machines require other operations such as stack probes or maintaining the back chain. Define this pattern to emit those operations in addition to updating the stack pointer.
On most machines you need not define this pattern, since GCC will already generate the correct code, which is to load the frame pointer and static chain, restore the stack (using the ‘restore_stack_nonlocal’ pattern, if defined), and jump indirectly to the dispatcher. You need only define this pattern if this code will not work on your machine.
jmp_buf. You will not normally need to define this pattern.
A typical reason why you might need this pattern is if some value, such as a pointer to a global table, must be restored, though it is preferred that the pointer value be recalculated if possible (given the address of a label, for instance). The single argument is a pointer to the jmp_buf. Note that the buffer is five words long and that the first three are normally used by the generic mechanism.
‘builtin_setjmp_setup’. The single argument is a pointer to the jmp_buf.
__builtin_eh_return, and thence the call frame exception handling library routines, are built. It is intended to handle non-trivial actions needed along the abnormal return path.
The address of the exception handler to which the function should return is passed as an operand to this pattern. It will normally need to be copied by the pattern to some special register or memory location.
If the pattern needs to determine the location of the target call frame in order to do so, it may use EH_RETURN_STACKADJ_RTX, if defined; it will have already been assigned.
If this pattern is not defined, the default action will be to simply
copy the return address to EH_RETURN_HANDLER_RTX. Either
that macro or this pattern needs to be defined if call frame exception
handling is to be used.
Using a prologue pattern is generally preferred over defining
TARGET_ASM_FUNCTION_PROLOGUE to emit assembly code for the prologue.
The prologue pattern is particularly useful for targets which perform
instruction scheduling.
Using an epilogue pattern is generally preferred over defining
TARGET_ASM_FUNCTION_EPILOGUE to emit assembly code for the epilogue.
The epilogue pattern is particularly useful for targets which perform
instruction scheduling or which have delay slots for their return instruction.
The sibcall_epilogue pattern must not clobber any arguments used for
parameter passing or any stack slots for arguments passed to the current
function.
A typical ctrap pattern looks like
(define_insn "ctrapsi4"
  [(trap_if (match_operator 0 "trap_operator"
             [(match_operand 1 "register_operand")
              (match_operand 2 "immediate_operand")])
            (match_operand 3 "const_int_operand" "i"))]
  ""
  "...")
Targets that do not support write prefetches or locality hints can ignore the values of operands 1 and 2.
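As an illustration of where those operand values come from, here is a minimal sketch of user code that reaches a prefetch pattern through the `__builtin_prefetch` built-in. The function name `sum_with_prefetch` and the prefetch distance of 8 elements are assumptions chosen for the example; the second and third arguments are the read/write flag and the locality hint that a target may ignore.

```c
#include <assert.h>
#include <stddef.h>

/* Sum an array while hinting at elements a few iterations ahead.
   __builtin_prefetch(addr, rw, locality): rw is 0 for a read and 1 for
   a write; locality ranges from 0 (no temporal locality) to 3 (high).
   A target without write prefetches or locality hints may treat every
   call as a plain read prefetch, or emit nothing at all. */
long sum_with_prefetch(const long *data, size_t n)
{
    const size_t ahead = 8;  /* hypothetical prefetch distance */
    long total = 0;

    assert(data != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (i + ahead < n)
            __builtin_prefetch(&data[i + ahead], 0, 3);
        total += data[i];
    }
    return total;
}
```

The result is the same whether or not the target honors the hints, which is why ignoring them is always a valid implementation.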
This pattern must show that both operand 0 and operand 1 are modified.
This pattern must issue any memory barrier instructions such that all memory operations before the atomic operation occur before the atomic operation and all memory operations after the atomic operation occur after the atomic operation.
For targets where the success or failure of the compare-and-swap
operation is available via the status flags, it is possible to
avoid a separate compare operation and issue the subsequent
branch or store-flag operation immediately after the compare-and-swap.
To this end, GCC will look for a MODE_CC set in the
output of sync_compare_and_swapmode; if the machine
description includes such a set, the target should also define special
cbranchcc4 and/or cstorecc4 instructions. GCC will then
be able to take the destination of the MODE_CC set and pass it
to the cbranchcc4 or cstorecc4 pattern as the first
operand of the comparison (the second will be (const_int 0)).
For targets where the operating system may provide support for this
operation via library calls, the sync_compare_and_swap_optab
may be initialized to a function with the same interface as the
__sync_val_compare_and_swap_n built-in. If the entire
set of __sync builtins is supported via library calls, the
target can initialize all of the optabs at once with
init_sync_libfuncs.
For the purposes of C++11 std::atomic::is_lock_free, it is
assumed that these library calls do not use any kind of
interruptible locking.
This pattern must issue any memory barrier instructions such that all memory operations before the atomic operation occur before the atomic operation and all memory operations after the atomic operation occur after the atomic operation.
If these patterns are not defined, the operation will be constructed from a compare-and-swap operation, if defined.
This pattern must issue any memory barrier instructions such that all memory operations before the atomic operation occur before the atomic operation and all memory operations after the atomic operation occur after the atomic operation.
If these patterns are not defined, the operation will be constructed from a compare-and-swap operation, if defined.
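The fallback described above can be pictured in source terms. The following is a minimal sketch, not the code GCC emits, of a fetch-and-OR built only from `__sync_val_compare_and_swap`; the function name `fetch_or_via_cas` is an assumption made for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Atomic OR constructed from compare-and-swap. Returns the value that
   was in memory before the operation. */
uint32_t fetch_or_via_cas(uint32_t *p, uint32_t mask)
{
    assert(p != NULL);

    /* Initial guess at the current contents. */
    uint32_t old = *(volatile uint32_t *)p;

    for (;;) {
        /* Store old | mask only if *p still equals old; the built-in
           returns what was actually found in memory. */
        uint32_t seen = __sync_val_compare_and_swap(p, old, old | mask);
        if (seen == old)
            return old;   /* the swap happened */
        old = seen;       /* another writer got in first; retry */
    }
}
```

The same loop shape works for any of the arithmetic or logical operations: only the expression that computes the new value changes.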
sync_old_op counterparts,
except that they return the value that exists in the memory location
after the operation, rather than before the operation.
In the ideal case, this operation is an atomic exchange operation, in which the previous value in the memory operand is copied into the result operand, and the value operand is stored in the memory operand.
For less capable targets, any value operand that is not the constant 1
should be rejected with FAIL. In this case the target may use
an atomic test-and-set bit operation. The result operand should contain
1 if the bit was previously set and 0 if the bit was previously clear.
The true contents of the memory operand are implementation defined.
This pattern must issue any memory barrier instructions such that the pattern as a whole acts as an acquire barrier; that is, all memory operations after the pattern do not occur until the lock is acquired.
If this pattern is not defined, the operation will be constructed from a compare-and-swap operation, if defined.
sync_lock_test_and_setmode. Operand 0 is the memory
that contains the lock; operand 1 is the value to store in the lock.
If the target doesn't implement full semantics for
sync_lock_test_and_setmode, any value operand which is not
the constant 0 should be rejected with FAIL, and the true contents
of the memory operand are implementation defined.
This pattern must issue any memory barrier instructions such that the pattern as a whole acts as a release barrier; that is, the lock is released only after all previous memory operations have completed.
If this pattern is not defined, then a memory_barrier pattern
will be emitted, followed by a store of the value to the memory operand.
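To show how the acquire and release halves pair up in practice, here is a minimal sketch of a spinlock written with the `__sync_lock_test_and_set` and `__sync_lock_release` built-ins. The type `spinlock_t` and the function names are assumptions made for the example. It only ever stores the constant 1 on acquire and 0 on release, so it stays within what a less capable target is required to accept.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical lock type; zero means unlocked. */
typedef struct { volatile int locked; } spinlock_t;

/* Try once to take the lock; returns nonzero on success.
   The test-and-set acts as an acquire barrier. */
int spin_trylock(spinlock_t *l)
{
    assert(l != NULL);
    return __sync_lock_test_and_set(&l->locked, 1) == 0;
}

/* Spin until the lock is taken. */
void spin_lock(spinlock_t *l)
{
    while (!spin_trylock(l))
        ;  /* busy-wait */
}

/* Release the lock; acts as a release barrier and stores 0. */
void spin_unlock(spinlock_t *l)
{
    assert(l != NULL);
    __sync_lock_release(&l->locked);
}
```

The code never inspects the stored value directly, only the result of the test-and-set, because the true contents of the memory operand are implementation defined.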
If memory referred to in operand 2 contains the value in operand 3, then operand 4 is stored in memory pointed to by operand 2 and fencing based on the memory model in operand 6 is issued.
If memory referred to in operand 2 does not contain the value in operand 3, then fencing based on the memory model in operand 7 is issued.
If a target does not support weak compare-and-swap operations, or the port elects not to implement weak operations, the argument in operand 5 can be ignored. Note that a strong implementation must be provided in either case.
If this pattern is not provided, the __atomic_compare_exchange
built-in functions will utilize the legacy sync_compare_and_swap
pattern with an __ATOMIC_SEQ_CST memory model.
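Seen from the user's side, the weak flag and the two memory models appear as arguments of `__atomic_compare_exchange_n`. The following minimal sketch shows an atomic maximum; the function name `atomic_max` and the choice of memory models are assumptions made for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Atomically raise *p to at least v. Returns the value observed in
   memory before any update made by this call. */
int atomic_max(int *p, int v)
{
    assert(p != NULL);

    int cur = __atomic_load_n(p, __ATOMIC_RELAXED);

    /* A weak compare-and-swap may fail spuriously, so it is only used
       inside a retry loop. On failure the built-in writes the value it
       found into cur, and the loop condition is evaluated again. */
    while (cur < v &&
           !__atomic_compare_exchange_n(p, &cur, v,
                                        true,               /* weak */
                                        __ATOMIC_ACQ_REL,   /* on success */
                                        __ATOMIC_RELAXED))  /* on failure */
        ;

    return cur;
}
```

On a target that ignores the weak flag, the same source simply gets the strong operation and the loop body runs at most once per conflicting writer.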
If not present, the __atomic_load built-in function will either
resort to a normal load with memory barriers, or a compare-and-swap
operation if a normal load would not be atomic.
If not present, the __atomic_store
built-in function will attempt to
perform a normal store and surround it with any required memory fences. If
the store would not be atomic, then an __atomic_exchange is
attempted with the result being ignored.
If this pattern is not present, the built-in function
__atomic_exchange will attempt to perform the operation with a
compare-and-swap loop.
If these patterns are not defined, attempts will be made to use legacy
sync patterns, or equivalent patterns which return a result. If
none of these are available a compare-and-swap loop will be used.
If these patterns are not defined, attempts will be made to use legacy
sync patterns. If none of these are available a compare-and-swap
loop will be used.
If these patterns are not defined, attempts will be made to use legacy
sync patterns, or equivalent patterns which return the result before
the operation followed by the arithmetic operation required to produce the
result. If none of these are available a compare-and-swap loop will be
used.
__builtin_atomic_test_and_set.
Operand 0 is an output operand which is set to true if the
previous contents of the byte were "set", and false otherwise. Operand 1
is the QImode memory to be modified. Operand 2 is the memory
model to be used.
The specific value that defines "set" is implementation defined, and is normally based on what is performed by the native atomic test and set instruction.
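Because the "set" value is implementation defined, portable code should look only at the boolean result and clear the byte with the matching built-in. Here is a minimal sketch using the user-level `__atomic_test_and_set` and `__atomic_clear` built-ins; the function names `claim_once` and `reset_once` are assumptions made for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Returns true only for the caller that finds the flag clear and
   sets it; every later caller sees true from the test-and-set and
   therefore gets false here. */
bool claim_once(bool *flag)
{
    assert(flag != NULL);
    return !__atomic_test_and_set(flag, __ATOMIC_ACQ_REL);
}

/* Put the flag back into the clear state. The byte is never written
   directly, since the bit pattern meaning "set" is up to the target. */
void reset_once(bool *flag)
{
    assert(flag != NULL);
    __atomic_clear(flag, __ATOMIC_RELEASE);
}
```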
If this pattern is not specified, all memory models except
__ATOMIC_RELAXED will result in issuing a sync_synchronize
barrier pattern.
This pattern should impact the compiler optimizers the same way that mem_signal_fence does, but it does not need to issue any barrier instructions.
If this pattern is not specified, all memory models except
__ATOMIC_RELAXED will result in issuing a sync_synchronize
barrier pattern.
__builtin_thread_pointer and __builtin_set_thread_pointer builtins.
The get/set patterns have a single output/input operand respectively,
with mode intended to be Pmode.
ptr_mode value from the memory
in operand 1 to the memory in operand 0 without leaving the value in
a register afterward. This is to avoid leaking the value some place
that an attacker might use to rewrite the stack guard slot after
having clobbered it.
If this pattern is not defined, then a plain move pattern is generated.
ptr_mode value from the
memory in operand 1 with the memory in operand 0 without leaving the
value in a register afterward and branches to operand 2 if the values
were equal.
If this pattern is not defined, then a plain compare pattern and a conditional branch pattern are used.
If this pattern is not defined, a call to the library function
__clear_cache is used.
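For context, user code usually reaches this pattern through the `__builtin___clear_cache` built-in after writing instructions into memory. The following is a minimal sketch; the helper name `install_code` is an assumption made for the example, and it only copies bytes and flushes the range without executing anything.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Copy len bytes of machine code into dst and make sure the
   instruction cache no longer holds stale contents for that range.
   On targets with coherent instruction caches the built-in typically
   expands to nothing; elsewhere it uses the target's pattern or falls
   back to calling __clear_cache. Returns the number of bytes copied. */
size_t install_code(unsigned char *dst, const unsigned char *src, size_t len)
{
    assert(dst != NULL && src != NULL);
    memcpy(dst, src, len);
    __builtin___clear_cache((char *)dst, (char *)dst + len);
    return len;
}
```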