Floating-point arithmetic issues in GCC

咖啡猪猪

GCC Ubuntu Reading C C++

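I have recently been reading the section "Intel IA32 floating-point arithmetic" in Chapter 2 of 《深入理解计算机系统》 (Computer Systems: A Programmer's Perspective), and found some problems with the test program given there. For background:

Floating-point registers use the 80-bit extended-precision format

The float type uses the 32-bit single-precision format

The double type uses the 64-bit double-precision format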

The example given in the book is:

#include<stdio.h>

double recip(int denom){
	return 1.0/(double) denom;
}

void do_nothing(){}

void test1(int denom){
	double r1, r2;
	int t1, t2;

	r1 = recip(denom);
	r2 = recip(denom);
	t1 = r1 == r2;
	do_nothing();
	t2 = r1 == r2;
	printf("test1 t1: r1 %f %c= r2 %f\n", r1, t1 ? '=' : '!', r2);
	printf("test1 t2: r1 %f %c= r2 %f\n", r1, t2 ? '=' : '!', r2);
}

int main(void){
	test1(10);
	return 0;
}

My system is Ubuntu 9.04

gcc version 4.3.3 (Ubuntu 4.3.3-5ubuntu4)

Step 1: compile without optimization

coffee@coffee-laptop:~$ gcc -o test test.c
coffee@coffee-laptop:~$ ./test
test1 t1: r1 0.100000 == r2 0.100000
test1 t2: r1 0.100000 == r2 0.100000

Step 2: compile with -O2 optimization

coffee@coffee-laptop:~$ gcc -O2 -o test test.c
coffee@coffee-laptop:~$ ./test
test1 t1: r1 0.100000 == r2 0.100000
test1 t2: r1 0.100000 == r2 0.100000

The output is not what I expected from the book, which was:

test1 t1: r1 0.100000 != r2 0.100000
test1 t2: r1 0.100000 == r2 0.100000

Next, add the second function given in the book, test2:

void test2(int denom){
  double r1;
  int t1;
  r1 = recip(denom);
  t1 = r1 == 1.0/(double) denom;
  printf("test2 t1: r1 %f %c= 1.0/10.0\n", r1, t1 ? '=' : '!');
}

Step 1: compile without optimization

coffee@coffee-laptop:~$ gcc -o test test.c
coffee@coffee-laptop:~$ ./test
test1 t1: r1 0.100000 == r2 0.100000
test1 t2: r1 0.100000 == r2 0.100000
test2 t1: r1 0.100000 != 1.0/10.0

Step 2: compile with -O2 optimization

coffee@coffee-laptop:~$ gcc -O2 -o test test.c
coffee@coffee-laptop:~$ ./test
test1 t1: r1 0.100000 == r2 0.100000
test1 t2: r1 0.100000 == r2 0.100000
test2 t1: r1 0.100000 == 1.0/10.0

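There are two main issues here:

1. How floating-point registers are used: a value held in a floating-point register is not necessarily equal to the same value after it has been stored to memory

2. How GCC handles floating point, especially when compiling with -O2

I'll continue looking into this tomorrow. That's it for today, time to rest!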

--------------------------------------------------------------------------------------------------------------------

Continued, 2010-01-30 (PS: I went shopping with classmates yesterday, haha!)

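First, I compiled the source with the command-line option -ffloat-store. This option tells GCC not to keep floating-point variables in registers, so a value assigned to a variable is written to memory, and thereby rounded to its declared type, before it is used again.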

Test results:

coffee@coffee-laptop:~$ gcc -ffloat-store -o test3 test.c
coffee@coffee-laptop:~$ ./test3
test1 t1: r1 0.100000 == r2 0.100000
test1 t2: r1 0.100000 == r2 0.100000
test2 t1: r1 0.100000 == 1.0/10.0

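I looked for material on this, and explanations of this topic available in China are still too scarce. So I compiled the program in the three ways above, disassembled each build, and compared the three assembly listings to reach a conclusion. In the build without optimization, test2 stores r1 to memory, which rounds it to the 64-bit double format. The quotient 1.0/(double) denom on the other side of the comparison is computed afterwards and is still in a floating-point register in the 80-bit extended-precision format. One operand has been rounded and the other has not, so the comparison reports them as unequal.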

Disassembly for reference

Compiled without optimization; below is the disassembly of the function test2

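  da:	55                   	push   %ebp
  db:	89 e5                	mov    %esp,%ebp
  dd:	83 ec 28             	sub    $0x28,%esp
  e0:	8b 45 08             	mov    0x8(%ebp),%eax
  e3:	89 04 24             	mov    %eax,(%esp)	# pass denom as the argument
  e6:	e8 fc ff ff ff       	call   e7 <test2+0xd>	# call to recip (target not yet relocated)
  eb:	dd 5d f0             	fstpl  -0x10(%ebp)	# store the result to r1 in memory as a 64-bit double
  ee:	db 45 08             	fildl  0x8(%ebp)	# load denom onto the x87 stack
  f1:	d9 e8                	fld1	# push 1.0
  f3:	de f1                	fdivp  %st,%st(1)	# 1.0/denom, left in an 80-bit register
  f5:	dd 45 f0             	fldl   -0x10(%ebp)	# reload the 64-bit r1
  f8:	da e9                	fucompp	# compare the two values and pop both
  fa:	df e0                	fnstsw %ax
  fc:	9e                   	sahf
  fd:	0f 94 c0             	sete   %al	# equal?
 100:	0f 9b c2             	setnp  %dl	# and not unordered (no NaN)?
 103:	21 d0                	and    %edx,%eax
 105:	0f b6 c0             	movzbl %al,%eax
 108:	89 45 fc             	mov    %eax,-0x4(%ebp)	# t1 = result of the comparison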
