`
dmhorse
  • 浏览: 25018 次
  • 来自: ...
最近访客 更多访客>>
社区版块
存档分类
最新评论

EBP ESP Function calling

阅读更多

After some research, I find 3 articles about these topic to clearly clarify how function is calling.

 

http://unixwiz.net/techtips/win32-callconv-asm.html

http://www.installsetupconfig.com/win32programming/processtoolhelpapis12_1.html

http://joshcarter.com/books/pragprowrimo_2009/functions_and_parameters

 

 

ne of the "big picture" issues in looking at compiled C code is the function-calling conventions. These are the methods that a calling function and a called function agree on how parameters and return values should be passed between them, and how the stack is used by the function itself. The layout of the stack constitutes the "stack frame", and knowing how this works can go a long way to decoding how something works.

In C and modern CPU design conventions, the stack frame is a chunk of memory, allocated from the stack, at run-time, each time a function is called, to store its automatic variables. Hence nested or recursive calls to the same function, each successively obtain their own separate frames.

Physically, a function's stack frame is the area between the addresses contained in esp, the stack pointer, and ebp, the frame pointer (base pointer in Intel terminology). Thus, if a function pushes more values onto the stack, it is effectively growing its frame.

This is a very low-level view: the picture as seen from the C/C++ programmer is illustrated elsewhere:

• Unixwiz.net Tech Tip: Intel x86 Function-call Conventions - C Programmer's View

For the sake of discussion, we're using the terms that the Microsoft Visual C compiler uses to describe these conventions, even though other platforms may use other terms.

__cdecl (pronounced see-DECK-'ll rhymes with "heckle")
This convention is the most common because it supports semantics required by the C language. The C language supports variadic functions (variable argument lists, alá printf), and this means that the caller must clean up the stack after the function call: the called function has no way to know how to do this. It's not terribly optimal, but the C language semantics demand it.
__stdcall
Also known as __pascal, this requires that each function take a fixed number of parameters, and this means that the called function can do argument cleanup in one place rather than have this be scattered throughout the program in every place that calls it. The Win32 API primarily uses __stdcall.

It's important to note that these are merely conventions, and any collection of cooperating code can agree on nearly anything. There are other conventions (passing parameters in registers, for instance) that behave differently, and of course the optimizer can make mincemeat of any clear picture as well.

Our focus here is to provide an overview, and not an authoritative definition for these conventions.

Register use in the stack frame

In both __cdecl and __stdcall conventions, the same set of three registers is involved in the function-call frame:

%ESP - Stack Pointer
This 32-bit register is implicitly manipulated by several CPU instructions (PUSH, POP, CALL, and RET among others), it always points to the last element used on the stack (not the first freeelement): this means that the PUSH and POP operations would be specified in pseudo-C as:
*--ESP = value;	// push

value = *ESP++;	// pop
The "Top of the stack" is an occupied location, not a free one, and is at the lowest memory address.
%EBP - Base Pointer
This 32-bit register is used to reference all the function parameters and local variables in the current stack frame. Unlike the %esp register, the base pointer is manipulated only explicitly. This is sometimes called the "Frame Pointer".
%EIP - Instruction Pointer
This holds the address of the next CPU instruction to be executed, and it's saved onto the stack as part of the CALL instruction. As well, any of the "jump" instructions modify the %EIP directly.

Assembler notation

Virtually everybody in the Intel assembler world uses the Intel notation, but the GNU C compiler uses what they call the "AT&T syntax" for backwards compatibility. This seems to us to be a really dumb idea, but it's a fact of life.

There are minor notational differences between the two notations, but by far the most annoying is that the AT&T syntax reverses the source and destination operands. To move the immediate value 4 into the EAX register:

mov $4, %eax          // AT&T notation

mov eax, 4            // Intel notation

More recent GNU compilers have a way to generate the Intel syntax, but it's not clear if the GNU assembler takes it. In any case, we'll use the Intel notation exclusively.

There are other minor differences that are not of much concern to the reverse engineer.

Calling a __cdecl function

The best way to understand the stack organization is to see each step in calling a function with the __cdecl conventions. These steps are taken automatically by the compiler, and though not all of them are used in every case (sometimes no parameters, sometimes no local variables, sometimes no saved registers), but this shows the overall mechanism employed.

Push parameters onto the stack, from right to left
Parameters are pushed onto the stack, one at a time, from right to left. Whether the parameters are evaluated from right to left is a different matter, and in any case this is unspecified by the language and code should never rely on this. The calling code must keep track of how many bytes of parameters have been pushed onto the stack so it can clean it up later.
Call the function
Here, the processor pushes contents of the %EIP (instruction pointer) onto the stack, and it points to the first byte after the CALL instruction. After this finishes, the caller has lost control, and the callee is in charge. This step does not change the %ebp register.
Save and update the %ebp
Now that we're in the new function, we need a new local stack frame pointed to by %ebp, so this is done by saving the current %ebp (which belongs to the previous function's frame) and making it point to the top of the stack.
push ebp
mov  ebp, esp    // ebp « esp
Once %ebp has been changed, it can now refer directly to the function's arguments as 8(%ebp)12(%ebp). Note that 0(%ebp) is the old base pointer and 4(%ebp) is the old instruction pointer.
Save CPU registers used for temporaries
[__cdecl stack frame]If this function will use any CPU registers, it has to save the old values first lest it walk on data used by the calling functions. Each register to be used is pushed onto the stack one at a time, and the compiler must remember what it did so it can unwind it later.
Allocate local variables
The function may choose to use local stack-based variables, and they are allocated here simply by decrementing the stack pointer by the amount of space required. This is always done in four-byte chunks.
Now, the local variables are located on the stack between the %ebp and %esp registers, and though it would be possible to refer to them as offsets from either one, by convention the %ebp register is used. This means that -4(%ebp) refers to the first local variable.
Perform the function's purpose
At this point, the stack frame is set up correctly, and this is represented by the diagram to the right. All the parameters and locals are offsets from the %ebp register:
16(%ebp) - third function parameter
12(%ebp) - second function parameter
8(%ebp) - first function parameter
4(%ebp) - old %EIP (the function's "return address")
0(%ebp) - old %EBP (previous function's base pointer)
-4(%ebp) - first local variable
-8(%ebp) - second local variable
-12(%ebp) - third local variable
The function is free to use any of the registers that had been saved onto the stack upon entry, but it must not change the stack pointer or all Hell will break loose upon function return.
Release local storage
When the function allocates local, temporary space, it does so by decrementing from the stack point by the amount space needed, and this process must be reversed to reclaim that space. It's usually done by adding to the stack pointer the same amount which was subtracted previously, though a series of POP instructions could achieve the same thing.
Restore saved registers
For each register saved onto the stack upon entry, it must be restored from the stack in reverse order. If the "save" and "restore" phases don't match exactly, catastrophic stack corruptionwill occur.
Restore the old base pointer
The first thing this function did upon entry was save the caller's %ebp base pointer, and by restoring it now (popping the top item from the stack), we effectively discard the entire local stack frame and put the caller's frame back in play.
Return from the function
This is the last step of the called function, and the RET instruction pops the old %EIP from the stack and jumps to that location. This gives control back to the calling function. Only the stack pointer and instruction pointers are modified by a subroutine return.
Clean up pushed parameters
In the __cdecl convention, the caller must clean up the parameters pushed onto the stack, and this is done either by popping the stack into don't-care registers (for a few parameters) or by adding the parameter-block size to the stack pointer directly.

__cdecl -vs- __stdcall

The __stdcall convention is mainly used by the Windows API, and it's a bit more compact than __cdecl. The main difference is that any given function has a hard-coded set of parameters, and this cannot vary from call to call like it can in C (no "variadic functions").

Because the size of the parameter block is fixed, the burden of cleaning these parameters off the stack can be shifted to the called function, instead of being done by the calling function as in __cdecl. There are several effects of this:

  1. the code is a tiny bit smaller, because the parameter-cleanup code is found once — in the called function itself — rather than in every place the function is called. These may be only a few bytes per call, but for commonly-used functions it can add up. This presumably means that the code may be a tiny bit faster as well.
  2. calling the function with the wrong number of parameters is catastrophic - the stack will be badly misaligned, and general havoc will surely ensue.
  3. As an offshoot of #2, Microsoft Visual C takes special care of functions that are B{__stdcall}. Since the number of parameters is known at compile time, the compiler encodes the parameter byte count in the symbol name itself, and this means that calling the function wrong leads to a link error.

    For instance, the function int foo(int a, int b) would generate — at the assembler level — the symbol "_foo@8", where "8" is the number of bytes expected. This means that not only will a call with 1 or 3 parameters not resolve (due to the size mismatch), but neither will a call expecting the __cdecl parameters (which looks for _foo). It's a clever mechanism that avoids a lot of problems.

Variations and Notes

The x86 architecture provides a number of built-in mechanisms for assisting with frame management, but they don't seem to be commonly used by C compilers. Of particular interest is theENTER instruction, which handles most of the function-prolog code.

ENTER 10,0          PUSH ebp
                     MOV  ebp, esp
                     SUB  esp, 10

We're pretty sure these are functionally equivalent, but our 80386 processor reference suggests that the ENTER version is more compact (6 bytes -vs- 9) but slower (15 clocks -vs- 6). The newer processors are probably harder to pin down, but somebody has probably figured out that ENTER is slower. Sigh.

分享到:
评论

相关推荐

    80x86汇编 是32位的

    - `prolog`:保存当前的EBP,并更新EBP指向ESP。 - `epilog`:恢复ESP到EBP的位置,并弹出EBP。 通过以上分析,我们可以了解到80x86汇编语言在32位环境下的基本指令集、寄存器使用规则、内存模型以及函数调用约定等...

    来教你如何在vb里txt

    - `move ebp, esp` - `push edi` - `push edx` - `push ecx` - `push ebx` - `move eax, dwordptr [ebp + 8]` - `cpuid` - `mov edi, dwordptr [ebp + 12]` - `movedwordptr [edi], ebx` - `movedi, ...

    C语言函数调用规定.doc-综合文档

    例如,`function(1, 2)`在汇编中会变成`push 1`、`push 2`、`call function`,而函数体内部则会通过`esp`和`ebp`寄存器来访问和清理堆栈。 2. **cdecl调用约定**: `cdecl`是C语言默认的调用约定,也是最灵活的。...

    查看进程信息,方便排查问题

    查看进程信息,方便排查问题

    IDA Pro分析STM32F1xx插件

    IDA Pro分析STM32F1xx插件

    基于SSH的线上医疗报销系统.zip-毕设&课设&实训&大作业&竞赛&项目

    项目工程资源经过严格测试运行并且功能上ok,可实现复现复刻,拿到资料包后可实现复现出一样的项目,本人系统开发经验充足(全栈全领域),有任何使用问题欢迎随时与我联系,我会抽时间努力为您解惑,提供帮助 【资源内容】:包含源码+工程文件+说明等。答辩评审平均分达到96分,放心下载使用!可实现复现;设计报告也可借鉴此项目;该资源内项目代码都经过测试运行,功能ok 【项目价值】:可用在相关项目设计中,皆可应用在项目、毕业设计、课程设计、期末/期中/大作业、工程实训、大创等学科竞赛比赛、初期项目立项、学习/练手等方面,可借鉴此优质项目实现复刻,设计报告也可借鉴此项目,也可基于此项目来扩展开发出更多功能 【提供帮助】:有任何使用上的问题欢迎随时与我联系,抽时间努力解答解惑,提供帮助 【附带帮助】:若还需要相关开发工具、学习资料等,我会提供帮助,提供资料,鼓励学习进步 下载后请首先打开说明文件(如有);整理时不同项目所包含资源内容不同;项目工程可实现复现复刻,如果基础还行,也可在此程序基础上进行修改,以实现其它功能。供开源学习/技术交流/学习参考,勿用于商业用途。质量优质,放心下载使用

    matlab的小型的微电网仿真模型文件

    小型的微电网仿真模型,简单模拟了光伏,家庭负载变化的使用情况

    MATLAB代码实现:分布式电源接入对配电网运行影响深度分析与评估,MATLAB代码分析:分布式电源接入对配电网运行影响评估,MATLAB代码:分布式电源接入对配电网影响分析 关键词:分布式电源 配电

    MATLAB代码实现:分布式电源接入对配电网运行影响深度分析与评估,MATLAB代码分析:分布式电源接入对配电网运行影响评估,MATLAB代码:分布式电源接入对配电网影响分析 关键词:分布式电源 配电网 评估 参考文档:《自写文档,联系我看》参考选址定容模型部分; 仿真平台:MATLAB 主要内容:代码主要做的是分布式电源接入场景下对配电网运行影响的分析,其中,可以自己设置分布式电源接入配电网的位置,接入配电网的有功功率以及无功功率的大小,通过牛顿拉夫逊法求解分布式电源接入后的电网潮流,从而评价分布式电源接入前后的电压、线路潮流等参数是否发生变化,评估配电网的运行方式。 代码非常精品,是研究含分布式电源接入的电网潮流计算的必备程序 ,分布式电源; 配电网; 接入影响分析; 潮流计算; 牛顿拉夫逊法; 电压评估; 必备程序。,基于MATLAB的分布式电源对配电网影响评估系统

    基于Unity-Bolt开发的游戏demo.zip

    项目工程资源经过严格测试运行并且功能上ok,可实现复现复刻,拿到资料包后可实现复现出一样的项目,本人系统开发经验充足(全栈全领域),有任何使用问题欢迎随时与我联系,我会抽时间努力为您解惑,提供帮助 【资源内容】:包含源码+工程文件+说明等。答辩评审平均分达到96分,放心下载使用!可实现复现;设计报告也可借鉴此项目;该资源内项目代码都经过测试运行,功能ok 【项目价值】:可用在相关项目设计中,皆可应用在项目、毕业设计、课程设计、期末/期中/大作业、工程实训、大创等学科竞赛比赛、初期项目立项、学习/练手等方面,可借鉴此优质项目实现复刻,设计报告也可借鉴此项目,也可基于此项目来扩展开发出更多功能 【提供帮助】:有任何使用上的问题欢迎随时与我联系,抽时间努力解答解惑,提供帮助 【附带帮助】:若还需要相关开发工具、学习资料等,我会提供帮助,提供资料,鼓励学习进步 下载后请首先打开说明文件(如有);整理时不同项目所包含资源内容不同;项目工程可实现复现复刻,如果基础还行,也可在此程序基础上进行修改,以实现其它功能。供开源学习/技术交流/学习参考,勿用于商业用途。质量优质,放心下载使用

    重庆市农村信用合作社 农商行数字银行系统建设方案.ppt

    重庆市农村信用合作社 农商行数字银行系统建设方案.ppt

    光伏并网逆变器设计方案与高效实现:结合matlab电路仿真、DSP代码及环流抑制策略,光伏并网逆变器设计方案:结合matlab电路文件与DSP程序代码,实现高效并联环流抑制策略,光伏并网逆变器设计方案

    光伏并网逆变器设计方案与高效实现:结合matlab电路仿真、DSP代码及环流抑制策略,光伏并网逆变器设计方案:结合matlab电路文件与DSP程序代码,实现高效并联环流抑制策略,光伏并网逆变器设计方案,附有相关的matlab电路文件,以及DSP的程序代码,方案、仿真文件、代码三者结合使用效果好,事半功倍。 备注:赠送逆变器并联环流matlab文件,基于矢量控制的环流抑制策略和下垂控制的环流抑制 ,光伏并网逆变器设计方案; MATLAB电路文件; DSP程序代码; 方案、仿真文件、代码结合使用; 并联环流抑制策略; 下垂控制的环流抑制,光伏并网逆变器优化设计:方案、仿真与DSP程序代码三合一,并赠送并联环流抑制策略Matlab文件

    Matlab实现WOA-GRU鲸鱼算法优化门控循环单元的数据多输入分类预测(含模型描述及示例代码)

    内容概要:本文介绍了通过 Matlab 实现鲸鱼优化算法(WOA)与门控循环单元(GRU)结合的多输入分类预测模型。文章首先概述了时间序列预测的传统方法局限性以及引入 WOA 的优势。然后,重点阐述了项目背景、目标、挑战及其独特之处。通过详细介绍数据预处理、模型构建、训练和评估步骤,最终展示了模型的效果预测图及应用实例。特别强调利用 WOA 改善 GRU 的参数设置,提高了多输入时间序列预测的准确性与鲁棒性。 适合人群:对时间序列分析有兴趣的研究者,从事金融、能源、制造业等行业数据分析的专业人士,具备一定的机器学习基础知识和技术经验。 使用场景及目标:本项目旨在开发一个高度准确和稳定的多变量时间序列预测工具,能够用于金融市场预测、能源需求规划、生产调度优化等领域,为企业和个人提供科学决策依据。 其他说明:项目提供的源代码和详细的开发指南有助于学习者快速掌握相关技能,并可根据实际需求调整模型参数以适应不同的业务情境。

    基于vue+elment-ui+node.js的后台管理系统 .zip(毕设&课设&实训&大作业&竞赛&项目)

    项目工程资源经过严格测试运行并且功能上ok,可实现复现复刻,拿到资料包后可实现复现出一样的项目,本人系统开发经验充足(全栈全领域),有任何使用问题欢迎随时与我联系,我会抽时间努力为您解惑,提供帮助 【资源内容】:包含源码+工程文件+说明等。答辩评审平均分达到96分,放心下载使用!可实现复现;设计报告也可借鉴此项目;该资源内项目代码都经过测试运行,功能ok 【项目价值】:可用在相关项目设计中,皆可应用在项目、毕业设计、课程设计、期末/期中/大作业、工程实训、大创等学科竞赛比赛、初期项目立项、学习/练手等方面,可借鉴此优质项目实现复刻,设计报告也可借鉴此项目,也可基于此项目来扩展开发出更多功能 【提供帮助】:有任何使用上的问题欢迎随时与我联系,抽时间努力解答解惑,提供帮助 【附带帮助】:若还需要相关开发工具、学习资料等,我会提供帮助,提供资料,鼓励学习进步 下载后请首先打开说明文件(如有);整理时不同项目所包含资源内容不同;项目工程可实现复现复刻,如果基础还行,也可在此程序基础上进行修改,以实现其它功能。供开源学习/技术交流/学习参考,勿用于商业用途。质量优质,放心下载使用

    Python 实现基于BiLSTM-AdaBoost双向长短期记忆网络结合AdaBoost多输入分类预测(含模型描述及示例代码)

    内容概要:本文介绍了Python中基于双向长短期记忆网络(BiLSTM)与AdaBoost相结合的多输入分类预测模型的设计与实现。BiLSTM擅长捕捉时间序列的双向依赖关系,而AdaBoost则通过集成弱学习器来提高分类精度和稳定性。文章详述了该项目的背景、目标、挑战、特色和应用场景,并提供了详细的模型构建流程、超参数优化以及视觉展示的方法和技术要点。此外,还附有完整的效果预测图表程序和具体示例代码,使读者可以快速上手构建属于自己的高效稳定的时间序列预测系统。 适合人群:对深度学习特别是时序数据分析感兴趣的开发者或者科研工作者;正在探索高级机器学习技术和寻求解决方案的企业分析师。 使用场景及目标:适用于希望提升时间序列或多输入数据类别判定准确度的业务情境,比如金融市场的走势预估、医学图像分析中的病变区域判读或是物联网环境监测下设备状态预警等任务。目的是为了创建更加智能且可靠的预测工具,在实际应用中带来更精准可靠的结果。 其他说明:文中提供的所有Python代码片段和方法都可以直接运用于实践中,并可根据特定的问题进行相应调整和扩展,进一步改进现有系统的效能并拓展新的功能特性。

    maven-script-interpreter-javadoc-1.0-7.el7.x64-86.rpm.tar.gz

    1、文件内容:maven-script-interpreter-javadoc-1.0-7.el7.rpm以及相关依赖 2、文件形式:tar.gz压缩包 3、安装指令: #Step1、解压 tar -zxvf /mnt/data/output/maven-script-interpreter-javadoc-1.0-7.el7.tar.gz #Step2、进入解压后的目录,执行安装 sudo rpm -ivh *.rpm 4、更多资源/技术支持:公众号禅静编程坊

    在云服务器上搭建MQTT服务器(超详细,一步到位)

    在云服务器上搭建MQTT服务器(超详细,一步到位)

    复现改进的L-SHADE差分进化算法求解最优化问题详解:附MATLAB源码与测试函数集,复现改进的L-SHADE差分进化算法求解最优化问题详解:MATLAB源码与测试集全攻略,复现改进的L-SHADE

    复现改进的L-SHADE差分进化算法求解最优化问题详解:附MATLAB源码与测试函数集,复现改进的L-SHADE差分进化算法求解最优化问题详解:MATLAB源码与测试集全攻略,复现改进的L-SHADE差分进化算法求最优化问题 对配套文献所提出的改进的L-SHADE差分进化算法求解最优化问题的的复现,提供完整MATLAB源代码和测试函数集,到手可运行,运行效果如图2所示。 代码所用测试函数集与文献相同:对CEC2014最优化测试函数集中的全部30个函数进行了测试验证,运行结果与文献一致。 ,复现; 改进的L-SHADE差分进化算法; 最优化问题求解; MATLAB源代码; 测试函数集; CEC2014最优化测试函数集,复现改进L-SHADE算法:最优化问题的MATLAB求解与验证

    天津大学:深度解读DeepSeek原理与效应.pdf

    天津大学:深度解读DeepSeek原理与效应.pdf 1.大语言模型发展路线图 2.DeepSeek V2-V3/R1技术原理 3DeepSeek效应 4.未来展望

    光伏混合储能微电网能量管理系统模型:基于MPPT控制的光伏发电与一阶低通滤波算法的混合储能系统优化管理,光伏混合储能微电网能量优化管理与稳定运行系统,光伏-混合储能微电网能量管理系统模型

    光伏混合储能微电网能量管理系统模型:基于MPPT控制的光伏发电与一阶低通滤波算法的混合储能系统优化管理,光伏混合储能微电网能量优化管理与稳定运行系统,光伏-混合储能微电网能量管理系统模型 系统主要由光伏发电模块、mppt控制模块、混合储能系统模块、直流负载模块、soc限值管理控制模块、hess能量管理控制模块。 光伏发电系统采用mppt最大跟踪控制,实现光伏功率的稳定输出;混合储能系统由蓄电池和超级电容组合构成,并采用一阶低通滤波算法实现两种储能介质间的功率分配,其中蓄电池响应目标功率中的低频部分,超级电容响应目标功率中的高频部分,最终实现对目标功率的跟踪响应;SOC限值管理控制,根据储能介质的不同特性,优化混合储能功率分配,进一步优化蓄电池充放电过程,再根据超级电容容量特点,设计其荷电状态区分管理策略,避免过充过放,维持系统稳定运行;最后,综合混合储能和系统功率平衡,针对光伏储能微电网的不同工况进行仿真实验,验证控制策略的有效性。 本模型完整无错,附带对应复现文献paper,容易理解,可塑性高 ,光伏; 混合储能系统; 能量管理; MPPT控制; 直流负载;

    Matlab算法下的A星路径规划改进版:提升搜索效率,优化拐角并路径平滑处理,Matlab下的A星算法改进:提升搜索效率、冗余拐角优化及路径平滑处理,Matlab算法代码 A星算法 路径规划A* As

    Matlab算法下的A星路径规划改进版:提升搜索效率,优化拐角并路径平滑处理,Matlab下的A星算法改进:提升搜索效率、冗余拐角优化及路径平滑处理,Matlab算法代码 A星算法 路径规划A* Astar算法仿真 传统A*+改进后的A*算法 Matlab代码 改进: ①提升搜索效率(引入权重系数) ②冗余拐角优化(可显示拐角优化次数) ③路径平滑处理(引入梯度下降算法配合S-G滤波器) ,Matlab算法代码; A星算法; 路径规划A*; Astar算法仿真; 传统A*; 改进A*算法; 提升搜索效率; 冗余拐角优化; 路径平滑处理; 权重系数; S-G滤波器。,Matlab中的A*算法:传统与改进的路径规划仿真研究

Global site tag (gtag.js) - Google Analytics