

3.18.3 ARC Options

The following options control the architecture variant for which code is being compiled:

-mbarrel-shifter

Generate instructions supported by the barrel shifter. This is the default unless -mcpu=ARC601 or -mcpu=ARCEM is in effect.

-mjli-always

Force function calls to use the jli_s instruction. This option is valid only for the ARCv2 architecture.

-mcpu=cpu

Set architecture type, register usage, and instruction scheduling parameters for cpu. There are also shortcut alias options available for backward compatibility and convenience. Supported values for cpu are:

arc600

Compile for ARC600. Aliases: -mA6, -mARC600.

arc601

Compile for ARC601. Alias: -mARC601.

arc700

Compile for ARC700. Aliases: -mA7, -mARC700. This is the default when configured with --with-cpu=arc700.

arcem

Compile for ARC EM.

archs

Compile for ARC HS.

em

Compile for ARC EM CPU with no hardware extensions.

em4

Compile for ARC EM4 CPU.

em4_dmips

Compile for ARC EM4 DMIPS CPU.

em4_fpus

Compile for ARC EM4 DMIPS CPU with the single-precision floating-point extension.

em4_fpuda

Compile for ARC EM4 DMIPS CPU with single-precision floating-point and double assist instructions.

hs

Compile for ARC HS CPU with no hardware extensions except the atomic instructions.

hs34

Compile for ARC HS34 CPU.

hs38

Compile for ARC HS38 CPU.

hs38_linux

Compile for ARC HS38 CPU with all hardware extensions on.

arc600_norm

Compile for ARC 600 CPU with norm instructions enabled.

arc600_mul32x16

Compile for ARC 600 CPU with norm and 32x16-bit multiply instructions enabled.

arc600_mul64

Compile for ARC 600 CPU with norm and mul64-family instructions enabled.

arc601_norm

Compile for ARC 601 CPU with norm instructions enabled.

arc601_mul32x16

Compile for ARC 601 CPU with norm and 32x16-bit multiply instructions enabled.

arc601_mul64

Compile for ARC 601 CPU with norm and mul64-family instructions enabled.

nps400

Compile for ARC 700 on NPS400 chip.

em_mini

Compile for ARC EM minimalist configuration featuring reduced register set.

-mdpfp
-mdpfp-compact

Generate double-precision FPX instructions, tuned for the compact implementation.

-mdpfp-fast

Generate double-precision FPX instructions, tuned for the fast implementation.

-mno-dpfp-lrsr

Prevent the lr and sr instructions from using FPX extension auxiliary registers.

-mea

Generate extended arithmetic instructions. Currently only divaw, adds, subs, and sat16 are supported. This is always enabled for -mcpu=ARC700.

-mno-mpy

Do not generate mpy-family instructions for ARC700. This option is deprecated.

-mmul32x16

Generate 32x16-bit multiply and multiply-accumulate instructions.

-mmul64

Generate mul64 and mulu64 instructions. Only valid for -mcpu=ARC600.

-mnorm

Generate norm instructions. This is the default if -mcpu=ARC700 is in effect.

-mspfp
-mspfp-compact

Generate single-precision FPX instructions, tuned for the compact implementation.

-mspfp-fast

Generate single-precision FPX instructions, tuned for the fast implementation.

-msimd

Enable generation of ARC SIMD instructions via target-specific builtins. Only valid for -mcpu=ARC700.

-msoft-float

This option is ignored; it is provided for compatibility purposes only. Software floating-point code is emitted by default, and this default can be overridden by the FPX options: -mspfp, -mspfp-compact, or -mspfp-fast for single precision, and -mdpfp, -mdpfp-compact, or -mdpfp-fast for double precision.

-mswap

Generate swap instructions.

-matomic

This enables use of the locked load/store conditional extension to implement atomic memory built-in functions. Not available for ARC 6xx or ARC EM cores.
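As a minimal sketch of the kind of source this option affects, the following C fragment uses GCC's __atomic_fetch_add built-in. The names counter_bump and event_count are hypothetical. Whether the operation is expanded inline using the locked load/store conditional extension or handled some other way depends on the selected core and options.

```c
#include <stdint.h>

/* Hypothetical shared counter, used only for illustration. */
static uint32_t event_count;

/* Atomically add n to the counter and return the value it held before. */
uint32_t counter_bump(uint32_t n)
{
    return __atomic_fetch_add(&event_count, n, __ATOMIC_SEQ_CST);
}
```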

-mdiv-rem

Enable div and rem instructions for ARCv2 cores.
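For illustration, integer division and remainder in C are the operations that the div and rem instructions can implement directly. The type divrem_t and the function divrem32 below are hypothetical names. Without hardware support, such code is typically compiled into calls to run-time library routines instead.

```c
#include <stdint.h>

/* Hypothetical result type holding quotient and remainder together. */
typedef struct {
    int32_t quot;
    int32_t rem;
} divrem_t;

/* Signed 32-bit division; C truncates the quotient toward zero.
   The caller must ensure b is not zero. */
divrem_t divrem32(int32_t a, int32_t b)
{
    divrem_t r = { a / b, a % b };
    return r;
}
```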

-mcode-density

Enable code density instructions for ARC EM. This option is on by default for ARC HS.

-mll64

Enable double load/store operations for ARC HS cores.

-mtp-regno=regno

Specify thread pointer register number.

-mmpy-option=multo

Compile ARCv2 code with a multiplier design option. You can specify the option using either a string or numeric value for multo. ‘wlh1’ is the default value. The recognized values are:

0
none

No multiplier available.

1
w

16x16 multiplier, fully pipelined. The following instructions are enabled: mpyw and mpyuw.

2
wlh1

32x32 multiplier, fully pipelined (1 stage). The following instructions are additionally enabled: mpy, mpyu, mpym, mpymu, and mpy_s.

3
wlh2

32x32 multiplier, fully pipelined (2 stages). The following instructions are additionally enabled: mpy, mpyu, mpym, mpymu, and mpy_s.

4
wlh3

Two 16x16 multipliers, blocking, sequential. The following instructions are additionally enabled: mpy, mpyu, mpym, mpymu, and mpy_s.

5
wlh4

One 16x16 multiplier, blocking, sequential. The following instructions are additionally enabled: mpy, mpyu, mpym, mpymu, and mpy_s.

6
wlh5

One 32x4 multiplier, blocking, sequential. The following instructions are additionally enabled: mpy, mpyu, mpym, mpymu, and mpy_s.

7
plus_dmpy

ARC HS SIMD support.

8
plus_macd

ARC HS SIMD support.

9
plus_qmacw

ARC HS SIMD support.

This option is only available for ARCv2 cores.
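The C fragment below sketches two multiplication shapes that relate to the instruction groups listed above. The function names are hypothetical, and the mapping to instructions is an assumption: a 16x16 product is a natural candidate for mpyw, and the upper half of a 32x32 product is assumed here to correspond to mpym. Which instructions are actually selected depends on the multo value and the compiler.

```c
#include <stdint.h>

/* 16x16 -> 32-bit signed product (assumed candidate for mpyw). */
int32_t mul16(int16_t a, int16_t b)
{
    return (int32_t)a * (int32_t)b;
}

/* Upper 32 bits of the 32x32 -> 64-bit signed product
   (assumed candidate for mpym). */
int32_t mulhi32(int32_t a, int32_t b)
{
    return (int32_t)(((int64_t)a * (int64_t)b) >> 32);
}
```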

-mfpu=fpu

Enables support for specific floating-point hardware extensions for ARCv2 cores. Supported values for fpu are:

fpus

Enables support for single-precision floating-point hardware extensions.

fpud

Enables support for double-precision floating-point hardware extensions. The single-precision floating-point extension is also enabled. Not available for ARC EM.

fpuda

Enables support for double-precision floating-point hardware extensions using double-precision assist instructions. The single-precision floating-point extension is also enabled. This option is only available for ARC EM.

fpuda_div

Enables support for double-precision floating-point hardware extensions using double-precision assist instructions. The single-precision floating-point, square-root, and divide extensions are also enabled. This option is only available for ARC EM.

fpuda_fma

Enables support for double-precision floating-point hardware extensions using double-precision assist instructions. The single-precision floating-point and fused multiply and add hardware extensions are also enabled. This option is only available for ARC EM.

fpuda_all

Enables support for double-precision floating-point hardware extensions using double-precision assist instructions. All single-precision floating-point hardware extensions are also enabled. This option is only available for ARC EM.

fpus_div

Enables support for single-precision floating-point, square-root and divide hardware extensions.

fpud_div

Enables support for double-precision floating-point, square-root and divide hardware extensions. This option includes option ‘fpus_div’. Not available for ARC EM.

fpus_fma

Enables support for single-precision floating-point and fused multiply and add hardware extensions.

fpud_fma

Enables support for double-precision floating-point and fused multiply and add hardware extensions. This option includes option ‘fpus_fma’. Not available for ARC EM.

fpus_all

Enables support for all single-precision floating-point hardware extensions.

fpud_all

Enables support for all single- and double-precision floating-point hardware extensions. Not available for ARC EM.

-mirq-ctrl-saved=register-range, blink, lp_count

Specifies the general-purpose registers that the processor automatically saves and restores on interrupt entry and exit. register-range is specified as two registers separated by a dash. The register range always starts with r0; the upper limit is the fp register. blink and lp_count are optional. This option is only valid for ARC EM and ARC HS cores.

-mrgf-banked-regs=number

Specifies the number of registers replicated in the second register bank on entry to a fast interrupt. Fast interrupts are interrupts with the highest priority level P0. These interrupts save only the PC and STATUS32 registers to avoid memory transactions during interrupt entry and exit sequences. Use this option when you are using fast interrupts in an ARCv2 family processor. Permitted values are 4, 8, 16, and 32.

-mlpc-width=width

Specify the width of the lp_count register. Valid values for width are 8, 16, 20, 24, 28 and 32 bits. The default width is 32 bits. If the width is less than 32, the compiler does not attempt to transform loops in your program to use the zero-delay loop mechanism unless it is known that the lp_count register can hold the required loop-counter value. Depending on the width specified, the compiler and run-time library might continue to use the loop mechanism for various needs. This option defines the macro __ARC_LPC_WIDTH__ with the value of width.
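As a rough sketch of how source code might consult __ARC_LPC_WIDTH__, the fragment below computes the largest count that fits in a register of a given width. It assumes lp_count behaves as an unsigned counter of that many bits, which the text above does not state. The helper names are hypothetical, and the fallback definition exists only so the fragment also builds where the macro is not defined.

```c
#include <stdint.h>

/* Fallback for builds where the macro is not defined;
   32 matches the documented default width. */
#ifndef __ARC_LPC_WIDTH__
#define __ARC_LPC_WIDTH__ 32
#endif

/* Largest unsigned value representable in 'width' bits (width 1..32). */
uint32_t lpc_max_count(unsigned width)
{
    return width >= 32 ? UINT32_MAX : ((UINT32_C(1) << width) - 1u);
}

/* Nonzero if 'trips' iterations fit in the configured width. */
int trip_count_fits(uint32_t trips)
{
    return trips <= lpc_max_count(__ARC_LPC_WIDTH__);
}
```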

-mrf16

This option instructs the compiler to generate code for a 16-entry register file. This option defines the __ARC_RF16__ preprocessor macro.
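A minimal sketch of testing for this configuration from C follows. The function name built_for_rf16 is hypothetical; only the macro comes from the description above.

```c
/* Returns 1 when the translation unit was compiled with the
   16-entry register file selected, 0 otherwise. */
int built_for_rf16(void)
{
#ifdef __ARC_RF16__
    return 1;
#else
    return 0;
#endif
}
```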

The following options are passed through to the assembler, and also define preprocessor macro symbols.

-mdsp-packa

Passed down to the assembler to enable the DSP Pack A extensions. Also sets the preprocessor symbol __Xdsp_packa. This option is deprecated.

-mdvbf

Passed down to the assembler to enable the dual Viterbi butterfly extension. Also sets the preprocessor symbol __Xdvbf. This option is deprecated.

-mlock

Passed down to the assembler to enable the locked load/store conditional extension. Also sets the preprocessor symbol __Xlock.

-mmac-d16

Passed down to the assembler. Also sets the preprocessor symbol __Xxmac_d16. This option is deprecated.

-mmac-24

Passed down to the assembler. Also sets the preprocessor symbol __Xxmac_24. This option is deprecated.

-mrtsc

Passed down to the assembler to enable the 64-bit time-stamp counter extension instruction. Also sets the preprocessor symbol __Xrtsc. This option is deprecated.

-mswape

Passed down to the assembler to enable the swap byte ordering extension instruction. Also sets the preprocessor symbol __Xswape.

-mtelephony

Passed down to the assembler to enable dual- and single-operand instructions for telephony. Also sets the preprocessor symbol __Xtelephony. This option is deprecated.

-mxy

Passed down to the assembler to enable the XY memory extension. Also sets the preprocessor symbol __Xxy.
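The preprocessor symbols set by these options can be tested in source code. The sketch below collects three of them into a bit mask; the enumerator and function names are hypothetical, and only the __Xlock, __Xswape, and __Xxy symbols come from the descriptions above.

```c
/* Hypothetical flag values, one per extension of interest. */
enum {
    EXT_LOCK  = 1u << 0,  /* -mlock  defines __Xlock  */
    EXT_SWAPE = 1u << 1,  /* -mswape defines __Xswape */
    EXT_XY    = 1u << 2   /* -mxy    defines __Xxy    */
};

/* Bit mask of the extensions enabled when this file was compiled. */
unsigned enabled_extensions(void)
{
    unsigned mask = 0;
#ifdef __Xlock
    mask |= EXT_LOCK;
#endif
#ifdef __Xswape
    mask |= EXT_SWAPE;
#endif
#ifdef __Xxy
    mask |= EXT_XY;
#endif
    return mask;
}
```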

The following options control how the assembly code is annotated:

-misize

Annotate assembler instructions with estimated addresses.

-mannotate-align

Explain what alignment considerations lead to the decision to make an instruction short or long.

The following options are passed through to the linker:

-marclinux

Passed through to the linker, to specify use of the arclinux emulation. This option is enabled by default in tool chains built for arc-linux-uclibc and arceb-linux-uclibc targets when profiling is not requested.

-marclinux_prof

Passed through to the linker, to specify use of the arclinux_prof emulation. This option is enabled by default in tool chains built for arc-linux-uclibc and arceb-linux-uclibc targets when profiling is requested.

The following options control the semantics of generated code:

-mlong-calls

Generate calls as register indirect calls, thus providing access to the full 32-bit address range.

-mmedium-calls

Use at least the 25-bit addressing range for calls, which is the offset range available for an unconditional branch-and-link instruction. Conditional execution of function calls is suppressed so that the 25-bit range can be used, rather than the 21-bit range available with a conditional branch-and-link. This is the default for tool chains built for arc-linux-uclibc and arceb-linux-uclibc targets.
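To give a feel for the two ranges, the sketch below computes how far a call can reach in each direction. It assumes the quoted widths describe signed byte offsets, which the text above does not state explicitly; the function name is hypothetical.

```c
#include <stdint.h>

/* Reach in bytes, in each direction, of a signed offset that is
   'bits' wide (assumption: byte-granular, two's complement). */
int64_t branch_reach_bytes(unsigned bits)
{
    return (int64_t)1 << (bits - 1);
}
```

Under that assumption, a 25-bit range reaches 16 MiB in each direction and a 21-bit range reaches 1 MiB.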

-G num

Put definitions of externally-visible data in a small data section if that data is no bigger than num bytes. The default value of num is 4 for any ARC configuration, or 8 when double load/store operations are available.
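The size test described above can be sketched as a simple comparison. The function name fits_small_data is hypothetical, and the sketch covers only the size criterion, not any other conditions the compiler may apply.

```c
#include <stddef.h>

/* Nonzero if an object of 'object_size' bytes is no bigger than
   the -G threshold 'g_num'. */
int fits_small_data(size_t object_size, size_t g_num)
{
    return object_size <= g_num;
}
```

With the default threshold of 4, a 4-byte object qualifies and an 8-byte object does not; with a threshold of 8, both qualify.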

-mno-sdata

Do not generate sdata references. This is the default for tool chains built for arc-linux-uclibc and arceb-linux-uclibc targets.

-mvolatile-cache

Use ordinarily cached memory accesses for volatile references. This is the default.

-mno-volatile-cache

Enable cache bypass for volatile references.
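The volatile references affected by these two options are ordinary C accesses through volatile-qualified objects, as in the sketch below. The function name read_reg is hypothetical. The C source is the same in both modes; only the kind of memory access the compiler emits differs.

```c
#include <stdint.h>

/* Read a 32-bit value through a volatile pointer, for example a
   memory-mapped device register. */
uint32_t read_reg(const volatile uint32_t *reg)
{
    return *reg;
}
```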

The following options fine tune code generation:

-malign-call

Do alignment optimizations for call instructions.

-mauto-modify-reg

Enable the use of pre/post modify with register displacement.

-mbbit-peephole

Enable bbit peephole2.

-mno-brcc

This option disables a target-specific pass in arc_reorg to generate compare-and-branch (brcc) instructions. It has no effect on generation of these instructions driven by the combiner pass.

-mcase-vector-pcrel

Use PC-relative switch case tables to enable case table shortening. This is the default for -Os.

-mcompact-casesi

Enable compact casesi pattern. This is the default for -Os, and only available for ARCv1 cores.

-mno-cond-exec

Disable the ARCompact-specific pass to generate conditional execution instructions.

Due to delay slot scheduling and interactions between operand numbers, literal sizes, instruction lengths, and the support for conditional execution, the target-independent pass to generate conditional execution is often lacking, so the ARC port has kept a special pass around that tries to find more conditional execution generation opportunities after register allocation, branch shortening, and delay slot scheduling have been done. This pass generally, but not always, improves performance and code size, at the cost of extra compilation time, which is why there is an option to switch it off. If you have a problem with call instructions exceeding their allowable offset range because they are conditionalized, you should consider using -mmedium-calls instead.

-mearly-cbranchsi

Enable pre-reload use of the cbranchsi pattern.

-mexpand-adddi

Expand adddi3 and subdi3 at RTL generation time into add.f, adc etc. This option is deprecated.

-mindexed-loads

Enable the use of indexed loads. This can be problematic because some optimizers then assume that indexed stores exist, which is not the case.

-mlra

Enable Local Register Allocation. This is still experimental for ARC, so by default the compiler uses standard reload (i.e. -mno-lra).

-mlra-priority-none

Don’t indicate any priority for target registers.

-mlra-priority-compact

Indicate target register priority for r0..r3 / r12..r15.

-mlra-priority-noncompact

Reduce target register priority for r0..r3 / r12..r15.

-mno-millicode

When optimizing for size (using -Os), prologues and epilogues that have to save or restore a large number of registers are often shortened by using a call to a special function in libgcc; this is referred to as a millicode call. As these calls can pose performance issues, and/or cause linking issues when linking in a nonstandard way, this option is provided to turn off millicode call generation.

-mmixed-code

Tweak register allocation to help 16-bit instruction generation. This generally has the effect of decreasing the average instruction size while increasing the instruction count.

-mq-class

Enable ‘q’ instruction alternatives. This is the default for -Os.

-mRcq

Enable ‘Rcq’ constraint handling. Most short code generation depends on this. This is the default.

-mRcw

Enable ‘Rcw’ constraint handling. The ccfsm condexec support mostly depends on this. This is the default.

-msize-level=level

Fine-tune size optimization with regards to instruction lengths and alignment. The recognized values for level are:

0

No size optimization. This level is deprecated and treated like ‘1’.

1

Short instructions are used opportunistically.

2

In addition, alignment of loops and of code after barriers is dropped.

3

In addition, optional data alignment is dropped, and the option -Os is enabled.

This defaults to ‘3’ when -Os is in effect. Otherwise, the behavior when this is not set is equivalent to level ‘1’.

-mtune=cpu

Set instruction scheduling parameters for cpu, overriding any implied by -mcpu=.

Supported values for cpu are:

ARC600

Tune for ARC600 CPU.

ARC601

Tune for ARC601 CPU.

ARC700

Tune for ARC700 CPU with standard multiplier block.

ARC700-xmac

Tune for ARC700 CPU with XMAC block.

ARC725D

Tune for ARC725D CPU.

ARC750D

Tune for ARC750D CPU.

-mmultcost=num

Cost to assume for a multiply instruction, with ‘4’ being equal to a normal instruction.

-munalign-prob-threshold=probability

Set probability threshold for unaligning branches. When tuning for ‘ARC700’ and optimizing for speed, branches without a filled delay slot are preferably emitted unaligned and long, unless profiling indicates that the probability of the branch being taken is below probability. See Cross-profiling. The default is (REG_BR_PROB_BASE/2), i.e. 5000.
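Since the default of REG_BR_PROB_BASE/2 is given as 5000, the base is 10000. Assuming the option value is expressed on that same scale, a percentage can be converted as in the sketch below; the names PROB_BASE and percent_to_threshold are hypothetical.

```c
/* Scale implied by the text: REG_BR_PROB_BASE / 2 == 5000. */
#define PROB_BASE 10000

/* Convert a whole-number percentage (0..100) to the assumed
   units of -munalign-prob-threshold. */
int percent_to_threshold(int percent)
{
    return PROB_BASE / 100 * percent;
}
```

Under this assumption, the default corresponds to 50 percent.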

The following options are maintained for backward compatibility, but are now deprecated and will be removed in a future release:

-margonaut

Obsolete FPX.

-mbig-endian
-EB

Compile code for big-endian targets. Use of these options is now deprecated. Big-endian code is supported by configuring GCC to build arceb-elf32 and arceb-linux-uclibc targets, for which big endian is the default.

-mlittle-endian
-EL

Compile code for little-endian targets. Use of these options is now deprecated. Little-endian code is supported by configuring GCC to build arc-elf32 and arc-linux-uclibc targets, for which little endian is the default.

-mbarrel_shifter

Replaced by -mbarrel-shifter.

-mdpfp_compact

Replaced by -mdpfp-compact.

-mdpfp_fast

Replaced by -mdpfp-fast.

-mdsp_packa

Replaced by -mdsp-packa.

-mEA

Replaced by -mea.

-mmac_24

Replaced by -mmac-24.

-mmac_d16

Replaced by -mmac-d16.

-mspfp_compact

Replaced by -mspfp-compact.

-mspfp_fast

Replaced by -mspfp-fast.

-mtune=cpu

Values ‘arc600’, ‘arc601’, ‘arc700’ and ‘arc700-xmac’ for cpu are replaced by ‘ARC600’, ‘ARC601’, ‘ARC700’ and ‘ARC700-xmac’ respectively.

-multcost=num

Replaced by -mmultcost.

