xujinquan19

All Optimization Options and Instructions GCC Supports for ARM


3.17.1 ARM Options

These `-m' options are defined for Advanced RISC Machines (ARM) architectures:

-mabi=name
Generate code for the specified ABI. Permissible values are: `apcs-gnu', `atpcs', `aapcs', `aapcs-linux' and `iwmmxt'.
-mapcs-frame
Generate a stack frame that is compliant with the ARM Procedure Call Standard for all functions, even if this is not strictly necessary for correct execution of the code. Specifying -fomit-frame-pointer with this option will cause the stack frames not to be generated for leaf functions. The default is -mno-apcs-frame.
-mapcs
This is a synonym for -mapcs-frame.
-mthumb-interwork
Generate code which supports calling between the ARM and Thumb instruction sets. Without this option the two instruction sets cannot be reliably used inside one program. The default is -mno-thumb-interwork, since slightly larger code is generated when -mthumb-interwork is specified.
-mno-sched-prolog
Prevent the reordering of instructions in the function prolog, or the merging of those instructions with the instructions in the function's body. This means that all functions will start with a recognizable set of instructions (or in fact one of a choice from a small set of different function prologues), and this information can be used to locate the start of functions inside an executable piece of code. The default is -msched-prolog.
-mfloat-abi=name
Specifies which floating-point ABI to use. Permissible values are: `soft', `softfp' and `hard'.

Specifying `soft' causes GCC to generate output containing library calls for floating-point operations. `softfp' allows the generation of code using hardware floating-point instructions, but still uses the soft-float calling conventions. `hard' allows generation of floating-point instructions and uses FPU-specific calling conventions.

The default depends on the specific target configuration. Note that the hard-float and soft-float ABIs are not link-compatible; you must compile your entire program with the same ABI, and link with a compatible set of libraries.
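
As a rough illustration of the three settings, the sketch below is a single floating-point multiply. The function name scale is hypothetical. The comments summarize what each -mfloat-abi value is described as changing above. Compiling it with -S and each value in turn, plus whatever -mcpu and -mfpu your target needs, should make the difference visible in the assembly.

```c
/* Hypothetical example function: one single-precision multiply.
 *
 * -mfloat-abi=soft   : the multiply becomes a library call and the
 *                      arguments travel in core registers.
 * -mfloat-abi=softfp : a hardware floating-point instruction may be used,
 *                      but the arguments still arrive in core registers
 *                      (soft-float calling convention).
 * -mfloat-abi=hard   : hardware instruction and FPU-specific calling
 *                      convention.
 */
float scale(float x, float k)
{
    return x * k;
}
```

Because the soft-float and hard-float conventions pass x and k differently, an object file built one way cannot safely call this function when it was built the other way.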

-mlittle-endian
Generate code for a processor running in little-endian mode. This is the default for all standard configurations.
-mbig-endian
Generate code for a processor running in big-endian mode; the default is to compile code for a little-endian processor.
-mwords-little-endian
This option only applies when generating code for big-endian processors. Generate code for a little-endian word order but a big-endian byte order. That is, a byte order of the form `32107654'. Note: this option should only be used if you require compatibility with code for big-endian ARM processors generated by versions of the compiler prior to 2.8.
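
To check which byte order a given build actually uses, a small run-time probe can help. This is a minimal sketch; the function name is_little_endian is an assumption made for illustration.

```c
#include <stdint.h>

/* Returns 1 when the program was built for (and runs on) a little-endian
 * target and 0 for a big-endian one, by looking at which byte of a known
 * 32-bit value is stored at the lowest address. */
int is_little_endian(void)
{
    const uint32_t probe = 0x01020304u;
    const unsigned char *bytes = (const unsigned char *)&probe;
    return bytes[0] == 0x04;
}
```

On a toolchain that supports both byte orders, building the same file once with -mlittle-endian and once with -mbig-endian would be expected to flip the result.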
-mcpu=name
This specifies the name of the target ARM processor. GCC uses this name to determine what kind of instructions it can emit when generating assembly code. Permissible names are: `arm2', `arm250', `arm3', `arm6', `arm60', `arm600', `arm610', `arm620', `arm7', `arm7m', `arm7d', `arm7dm', `arm7di', `arm7dmi', `arm70', `arm700', `arm700i', `arm710', `arm710c', `arm7100', `arm720', `arm7500', `arm7500fe', `arm7tdmi', `arm7tdmi-s', `arm710t', `arm720t', `arm740t', `strongarm', `strongarm110', `strongarm1100', `strongarm1110', `arm8', `arm810', `arm9', `arm9e', `arm920', `arm920t', `arm922t', `arm946e-s', `arm966e-s', `arm968e-s', `arm926ej-s', `arm940t', `arm9tdmi', `arm10tdmi', `arm1020t', `arm1026ej-s', `arm10e', `arm1020e', `arm1022e', `arm1136j-s', `arm1136jf-s', `mpcore', `mpcorenovfp', `arm1156t2-s', `arm1156t2f-s', `arm1176jz-s', `arm1176jzf-s', `cortex-a5', `cortex-a8', `cortex-a9', `cortex-a15', `cortex-r4', `cortex-r4f', `cortex-m4', `cortex-m3', `cortex-m1', `cortex-m0', `xscale', `iwmmxt', `iwmmxt2', `ep9312'.
-mtune=name
This option is very similar to the -mcpu= option, except that it does not restrict which instructions can be used. Instead, it tells GCC to tune the performance of the code as if the target were of the type specified in this option, while still choosing the instructions that it will generate based on the CPU specified by a -mcpu= option. For some ARM implementations better performance can be obtained by using this option.
-march=name
This specifies the name of the target ARM architecture. GCC uses this name to determine what kind of instructions it can emit when generating assembly code. This option can be used in conjunction with or instead of the -mcpu= option. Permissible names are: `armv2', `armv2a', `armv3', `armv3m', `armv4', `armv4t', `armv5', `armv5t', `armv5e', `armv5te', `armv6', `armv6j', `armv6t2', `armv6z', `armv6zk', `armv6-m', `armv7', `armv7-a', `armv7-r', `armv7-m', `iwmmxt', `iwmmxt2', `ep9312'.
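
One way to see what -march= or -mcpu= selected is to look at the macros the compiler predefines. The sketch below assumes a GCC release recent enough to define __ARM_ARCH on ARM targets, which is not stated in the text above. The function name target_arm_arch is hypothetical.

```c
/* Reports the ARM architecture level the compiler was asked to target.
 * Assumption: __ARM_ARCH is predefined on ARM targets by the compiler in
 * use. On non-ARM targets, or older compilers, this returns 0. */
int target_arm_arch(void)
{
#if defined(__ARM_ARCH)
    return __ARM_ARCH;
#else
    return 0;
#endif
}
```

The same information can be inspected without compiling any code by asking the preprocessor to dump its predefined macros (-dM -E) with the chosen -march= or -mcpu= value.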
-mfpu=name
-mfpe=number
-mfp=number
This specifies what floating point hardware (or hardware emulation) is available on the target. Permissible names are: `fpa', `fpe2', `fpe3', `maverick', `vfp', `vfpv3', `vfpv3-fp16', `vfpv3-d16', `vfpv3-d16-fp16', `vfpv3xd', `vfpv3xd-fp16', `neon', `neon-fp16', `vfpv4', `vfpv4-d16', `fpv4-sp-d16' and `neon-vfpv4'. -mfp and -mfpe are synonyms for -mfpu=`fpe'number, for compatibility with older versions of GCC.

If -msoft-float is specified this specifies the format of floating point values.

If the selected floating-point hardware includes the NEON extension (e.g. -mfpu=`neon'), note that floating-point operations will not be used by GCC's auto-vectorization pass unless -funsafe-math-optimizations is also specified. This is because NEON hardware does not fully implement the IEEE 754 standard for floating-point arithmetic (in particular denormal values are treated as zero), so the use of NEON instructions may lead to a loss of precision.
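
The kind of loop this affects can be sketched as follows. The function saxpy is a hypothetical example, not taken from the GCC manual. With -mfpu=neon and the vectorizer enabled, a floating-point loop like this one would only be expected to use NEON instructions if -funsafe-math-optimizations is also given.

```c
#include <stddef.h>

/* y[i] += a * x[i] for n elements. The restrict qualifiers promise that
 * x and y do not overlap, which makes the loop a simple candidate for
 * auto-vectorization. */
void saxpy(size_t n, float a, const float *restrict x, float *restrict y)
{
    for (size_t i = 0; i < n; ++i)
        y[i] += a * x[i];
}
```

Whether the loop is actually vectorized depends on the compiler version and optimization level, so it is worth checking the generated assembly rather than assuming it.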

-mfp16-format=name
Specify the format of the __fp16 half-precision floating-point type. Permissible names are `none', `ieee', and `alternative'; the default is `none', in which case the __fp16 type is not defined. See Half-Precision, for more information.
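
A minimal sketch of using the type is shown below. It assumes the compiler predefines __ARM_FP16_FORMAT_IEEE or __ARM_FP16_FORMAT_ALTERNATIVE when a half-precision format has been selected. That holds for recent GCC releases but is an assumption for older ones. The function name through_half is hypothetical.

```c
/* Narrows a float to 16-bit storage and widens it back. When no __fp16
 * format is selected the type does not exist, so the value is returned
 * unchanged. */
float through_half(float x)
{
#if defined(__ARM_FP16_FORMAT_IEEE) || defined(__ARM_FP16_FORMAT_ALTERNATIVE)
    __fp16 h = x;   /* conversion to half precision may lose precision */
    return h;       /* promoted back to float for arithmetic */
#else
    return x;       /* __fp16 is not defined in this configuration */
#endif
}
```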
-mstructure-size-boundary=n
The size of all structures and unions will be rounded up to a multiple of the number of bits set by this option. Permissible values are 8, 32 and 64. The default value varies for different toolchains. For the COFF targeted toolchain the default value is 8. A value of 64 is only allowed if the underlying ABI supports it.

Specifying the larger number can produce faster, more efficient code, but can also increase the size of the program. Different values are potentially incompatible. Code compiled with one value cannot necessarily expect to work with code or libraries compiled with another value, if they exchange information using structures or unions.
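
The rounding itself is simple arithmetic, sketched below. The helper padded_struct_size is hypothetical and only models the size rule described above. It does not query the compiler.

```c
#include <assert.h>
#include <stddef.h>

/* Rounds a structure's natural size in bytes up to a multiple of the
 * boundary, which is given in bits as for -mstructure-size-boundary=n. */
size_t padded_struct_size(size_t natural_bytes, unsigned boundary_bits)
{
    /* Only the values listed as permissible are accepted. */
    assert(boundary_bits == 8 || boundary_bits == 32 || boundary_bits == 64);
    size_t boundary_bytes = boundary_bits / 8;
    return (natural_bytes + boundary_bytes - 1) / boundary_bytes * boundary_bytes;
}
```

A structure holding a single char would occupy 1 byte under a boundary of 8 but 4 bytes under a boundary of 32. This is why code built with different values may disagree about structure layout.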

-mabort-on-noreturn
Generate a call to the function abort at the end of a noreturn function. It will be executed if the function tries to return.
-mlong-calls
-mno-long-calls
Tells the compiler to perform function calls by first loading the address of the function into a register and then performing a subroutine call on this register. This switch is needed if the target function will lie outside of the 64 megabyte addressing range of the offset based version of subroutine call instruction.

Even if this switch is enabled, not all function calls will be turned into long calls. The heuristic is that static functions, functions which have the `short-call' attribute, functions that are inside the scope of a `#pragma no_long_calls' directive and functions whose definitions have already been compiled within the current compilation unit, will not be turned into long calls. The exception to this rule is that weak function definitions, functions with the `long-call' attribute or the `section' attribute, and functions that are within the scope of a `#pragma long_calls' directive, will always be turned into long calls.

This feature is not enabled by default. Specifying -mno-long-calls will restore the default behavior, as will placing the function calls within the scope of a `#pragma long_calls_off' directive. Note these switches have no effect on how the compiler generates code to handle function calls via function pointers.
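
The per-function overrides mentioned above can be sketched as follows. In source code the attributes are spelled long_call and short_call. The macro names LONG_CALL and SHORT_CALL and the function names are hypothetical. The attributes are hidden on non-ARM targets so that the file still compiles there.

```c
/* The attributes are ARM-specific, so expand to nothing elsewhere. */
#if defined(__arm__)
#define LONG_CALL  __attribute__((long_call))
#define SHORT_CALL __attribute__((short_call))
#else
#define LONG_CALL
#define SHORT_CALL
#endif

/* Hypothetical function that may be placed out of branch range:
 * called by loading its address into a register first. */
LONG_CALL int far_add(int a, int b);

/* Hypothetical nearby function: called with a direct branch even when
 * the file is compiled with -mlong-calls. */
SHORT_CALL int near_add(int a, int b);

int far_add(int a, int b)  { return a + b; }
int near_add(int a, int b) { return a + b; }

int sum_both(int a, int b)
{
    return far_add(a, b) + near_add(a, b);
}
```

The pragmas named above (long_calls, no_long_calls, long_calls_off) give the same control over a region of declarations instead of a single function.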

-msingle-pic-base
Treat the register used for PIC addressing as read-only, rather than loading it in the prologue for each function. The run-time system is responsible for initializing this register with an appropriate value before execution begins.
-mpic-register=reg
Specify the register to be used for PIC addressing. The default is R10 unless stack-checking is enabled, when R9 is used.
-mcirrus-fix-invalid-insns
Insert NOPs into the instruction stream in order to work around problems with invalid Maverick instruction combinations. This option is only valid if the -mcpu=ep9312 option has been used to enable generation of instructions for the Cirrus Maverick floating point co-processor. This option is not enabled by default, since the problem is only present in older Maverick implementations. The default can be re-enabled by use of the -mno-cirrus-fix-invalid-insns switch.
-mpoke-function-name
Write the name of each function into the text section, directly preceding the function prologue. The generated code is similar to this:
     t0:
         .ascii "arm_poke_function_name", 0
         .align
     t1:
         .word 0xff000000 + (t1 - t0)
     arm_poke_function_name:
         mov     ip, sp
         stmfd   sp!, {fp, ip, lr, pc}
         sub     fp, ip, #4

When performing a stack backtrace, code can inspect the value of pc stored at fp + 0. If the trace function then looks at location pc - 12 and the top 8 bits are set, then we know that there is a function name embedded immediately preceding this location, and that its length is ((pc[-3]) & ~0xff000000), that is, the low 24 bits of the marker word.
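
The marker word in the listing is 0xff000000 + (t1 - t0), so the top 8 bits act as a tag and the low 24 bits carry the length of the padded name. A backtrace helper might decode it roughly like this. The function name poked_name_length is hypothetical, and locating the word in memory is left to the caller.

```c
#include <stdint.h>

/* Returns the length in bytes of the embedded, padded function name when
 * marker looks like the word written by -mpoke-function-name (top 8 bits
 * all set), or -1 when it does not. */
long poked_name_length(uint32_t marker)
{
    if ((marker & 0xff000000u) != 0xff000000u)
        return -1;                           /* no embedded name here */
    return (long)(marker & 0x00ffffffu);     /* low 24 bits: t1 - t0 */
}
```

The name string itself would then start that many bytes before the marker word. For the listing above the length would be 24, assuming .align pads the 23-byte string to a 4-byte boundary.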

-mthumb
Generate code for the Thumb instruction set. The default is to use the 32-bit ARM instruction set. This option automatically enables either 16-bit Thumb-1 or mixed 16/32-bit Thumb-2 instructions based on the -mcpu=name and -march=name options. This option is not passed to the assembler. If you want to force assembler files to be interpreted as Thumb code, either add a `.thumb' directive to the source or pass the -mthumb option directly to the assembler by prefixing it with -Wa, (that is, -Wa,-mthumb).
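
Which instruction set a translation unit ends up using can be checked from the macros GCC predefines on ARM targets. The sketch below relies on __thumb__, __thumb2__ and __arm__. The function name instruction_set_name is hypothetical.

```c
/* Names the instruction set this file is being compiled for. __thumb2__
 * is tested first because Thumb-2 builds define __thumb__ as well. */
const char *instruction_set_name(void)
{
#if defined(__thumb2__)
    return "Thumb-2";
#elif defined(__thumb__)
    return "Thumb-1";
#elif defined(__arm__)
    return "ARM";
#else
    return "not an ARM target";
#endif
}
```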
-mtpcs-frame
Generate a stack frame that is compliant with the Thumb Procedure Call Standard for all non-leaf functions. (A leaf function is one that does not call any other functions.) The default is -mno-tpcs-frame.
-mtpcs-leaf-frame
Generate a stack frame that is compliant with the Thumb Procedure Call Standard for all leaf functions. (A leaf function is one that does not call any other functions.) The default is -mno-tpcs-leaf-frame.
-mcallee-super-interworking
Gives all externally visible functions in the file being compiled an ARM instruction set header which switches to Thumb mode before executing the rest of the function. This allows these functions to be called from non-interworking code. This option is not valid in AAPCS configurations because interworking is enabled by default.
-mcaller-super-interworking
Allows calls via function pointers (including virtual functions) to execute correctly regardless of whether the target code has been compiled for interworking or not. There is a small overhead in the cost of executing a function pointer if this option is enabled. This option is not valid in AAPCS configurations because interworking is enabled by default.
-mtp=name
Specify the access model for the thread local storage pointer. The valid models are soft, which generates calls to __aeabi_read_tp; cp15, which fetches the thread pointer from cp15 directly (supported in the ARMv6K architecture); and auto, which uses the best available method for the selected processor. The default setting is auto.
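
The option matters for any code that touches thread-local variables, such as the sketch below. It uses GCC's __thread storage class. The names per_thread_counter and bump_counter are hypothetical.

```c
/* Each thread gets its own copy of this counter. To find that copy the
 * generated code needs the thread pointer, which is obtained either by
 * calling __aeabi_read_tp (-mtp=soft) or by reading cp15 (-mtp=cp15). */
static __thread int per_thread_counter;

/* Increments and returns the calling thread's private counter. */
int bump_counter(void)
{
    return ++per_thread_counter;
}
```

Comparing the assembly for this function under -mtp=soft and -mtp=cp15 is a straightforward way to see the two access models side by side.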
-mword-relocations
Only generate absolute relocations on word sized values (i.e. R_ARM_ABS32). This is enabled by default on targets (uClinux, SymbianOS) where the runtime loader imposes this restriction, and when -fpic or -fPIC is specified.
-mfix-cortex-m3-ldrd
Some Cortex-M3 cores can cause data corruption when ldrd instructions with overlapping destination and base registers are used. This option avoids generating these instructions. This option is enabled by default when -mcpu=cortex-m3 is specified.