The implementation of strlen in glibc

simohayha


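The main idea behind glibc's strlen is to examine one long int per iteration (4 bytes on a 32-bit target, 8 bytes on a 64-bit one) instead of one byte. That cuts the number of loop iterations and improves overall efficiency.

It relies on two tricks. First, the address of the string passed in may not be aligned on a long int boundary, so the function first walks the string byte by byte until it reaches an aligned address, and only then starts comparing a word at a time.

The second trick is how to test efficiently whether any byte in the word is 0.

Now for the source. Its comments are fairly detailed, and most of it is bit-manipulation tricks: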

size_t
strlen (str)
     const char *str;
{
  const char *char_ptr;
  const unsigned long int *longword_ptr;
  unsigned long int longword, himagic, lomagic;

  /* Handle the first few characters by reading one character at a time.
     Do this until CHAR_PTR is aligned on a longword boundary.  */

  for (char_ptr = str; ((unsigned long int) char_ptr
			& (sizeof (longword) - 1)) != 0;
       ++char_ptr)
    if (*char_ptr == '\0')
      return char_ptr - str;

  /* All these elucidatory comments refer to 4-byte longwords,
     but the theory applies equally well to 8-byte longwords.  */

  longword_ptr = (unsigned long int *) char_ptr;

  /* Bits 31, 24, 16, and 8 of this number are zero.  Call these bits
     the "holes."  Note that there is a hole just to the left of
     each byte, with an extra at the end:

     bits:  01111110 11111110 11111110 11111111
     bytes: AAAAAAAA BBBBBBBB CCCCCCCC DDDDDDDD

     The 1-bits make sure that carries propagate to the next 0-bit.
     The 0-bits provide holes for carries to fall into.  */
  himagic = 0x80808080L;
  lomagic = 0x01010101L;
  if (sizeof (longword) > 4)
    {
      /* 64-bit version of the magic.  */
      /* Do the shift in two steps to avoid a warning if long has 32 bits.  */
      himagic = ((himagic << 16) << 16) | himagic;
      lomagic = ((lomagic << 16) << 16) | lomagic;
    }
  if (sizeof (longword) > 8)
    abort ();

  /* Instead of the traditional loop which tests each character,
     we will test a longword at a time.  The tricky part is testing
     if *any of the four* bytes in the longword in question are zero.  */
  for (;;)
    {
      longword = *longword_ptr++;

      if (((longword - lomagic) & ~longword & himagic) != 0)
	{
	  /* Which of the bytes was the zero?  If none of them were, it was
	     a misfire; continue the search.  */

	  const char *cp = (const char *) (longword_ptr - 1);

	  if (cp[0] == 0)
	    return cp - str;
	  if (cp[1] == 0)
	    return cp - str + 1;
	  if (cp[2] == 0)
	    return cp - str + 2;
	  if (cp[3] == 0)
	    return cp - str + 3;
	  if (sizeof (longword) > 4)
	    {
	      if (cp[4] == 0)
		return cp - str + 4;
	      if (cp[5] == 0)
		return cp - str + 5;
	      if (cp[6] == 0)
		return cp - str + 6;
	      if (cp[7] == 0)
		return cp - str + 7;
	    }
	}
    }
}

The code is fairly long, but the comments are detailed too, so here I'll only analyze the two bit operations:

  for (char_ptr = str; ((unsigned long int) char_ptr
			& (sizeof (longword) - 1)) != 0;
       ++char_ptr)
    if (*char_ptr == '\0')
      return char_ptr - str;

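This loop looks for the first address aligned on a longword boundary (4 bytes here), checking each byte for the terminator along the way. If the pointer passed in is already aligned, as it is for the start of a buffer returned by malloc, the loop body never runs. That is not guaranteed, though: a pointer into the middle of a string can have any alignment.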
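As a rough illustration of the address arithmetic in that loop, here is a minimal sketch. The helper names `bytes_until_aligned` and `check_all_offsets` are mine, not glibc's, and the sketch assumes (as the glibc code does) that converting a pointer to an integer yields an address whose low bits reflect its alignment.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not part of glibc: how many bytes have to be
   scanned one at a time before P reaches an address that is a multiple
   of sizeof (unsigned long).  */
static size_t
bytes_until_aligned (const char *p)
{
  size_t mask = sizeof (unsigned long) - 1;      /* 3 or 7 */
  size_t misalignment = (uintptr_t) p & mask;    /* low address bits */
  return misalignment == 0 ? 0 : sizeof (unsigned long) - misalignment;
}

/* Try every starting offset inside one longword and check that adding
   the computed count always lands on a longword boundary.  */
static void
check_all_offsets (void)
{
  static unsigned long storage[2];   /* aligned for unsigned long */
  const char *base = (const char *) storage;

  for (size_t off = 0; off < sizeof (unsigned long); ++off)
    {
      const char *p = base + off;
      size_t n = bytes_until_aligned (p);
      assert (n < sizeof (unsigned long));
      assert (((uintptr_t) (p + n) & (sizeof (unsigned long) - 1)) == 0);
    }
}
```

An aligned pointer gives 0, which is the case where the byte loop in strlen is skipped entirely; the worst case is sizeof (unsigned long) - 1 single-byte steps.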

Next, how to tell whether any byte of a 4-byte number is 0.

  himagic = 0x80808080L;
  lomagic = 0x01010101L;
if (((longword - lomagic) & ~longword & himagic) != 0)

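Look at it in two parts.

The first part, (longword - lomagic), subtracts 1 from every byte. Leaving borrows between bytes aside, the high bit of a byte in the result is set if that byte of longword was 0 (it wraps around to 0xFF) or was greater than 0x80.

The second part, (~longword & himagic), looks at the high bit of each byte of longword: a byte below 0x80 contributes 0x80 at its position, and a byte of 0x80 or more contributes 0.

Put together, a byte's high bit survives in the result only if that byte is below 0x80 and the subtraction set its high bit, which, barring a borrow from a lower byte, means the byte is 0. A borrow can only come from a lower byte that is itself 0, so the whole expression is non-zero exactly when some byte of longword is 0.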
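To make the claim easier to check, here is a small sketch that compares the expression with a plain byte-by-byte test. It is fixed to 32-bit words for simplicity, and the function names (`has_zero_byte`, `has_zero_byte_naive`, `check_boundary_words`) are made up for this example. It only tries words built from a handful of boundary byte values, so it is a sanity check rather than a proof.

```c
#include <assert.h>
#include <stdint.h>

/* The expression from the listing, fixed to 32-bit words.  */
static int
has_zero_byte (uint32_t w)
{
  return ((w - UINT32_C (0x01010101)) & ~w & UINT32_C (0x80808080)) != 0;
}

/* Reference answer: inspect each byte separately.  */
static int
has_zero_byte_naive (uint32_t w)
{
  for (int i = 0; i < 4; ++i)
    if (((w >> (8 * i)) & 0xFF) == 0)
      return 1;
  return 0;
}

/* Compare both tests on every word whose bytes are drawn from a few
   boundary values (zero, borrow cases, and both sides of 0x80).  */
static void
check_boundary_words (void)
{
  static const uint32_t v[] = { 0x00, 0x01, 0x7F, 0x80, 0x81, 0xFF };
  enum { N = sizeof v / sizeof v[0] };

  for (int a = 0; a < N; ++a)
    for (int b = 0; b < N; ++b)
      for (int c = 0; c < N; ++c)
        for (int d = 0; d < N; ++d)
          {
            uint32_t w = v[a] << 24 | v[b] << 16 | v[c] << 8 | v[d];
            assert (has_zero_byte (w) == has_zero_byte_naive (w));
          }
}
```

For instance, has_zero_byte (0x41424300) is true because the lowest byte is 0, while a word made only of bytes at or above 0x80, such as 0x80FF8181, gives false.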

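Note that the test does not depend on the characters being ASCII in the range [0, 127]: bytes of 0x80 and above are masked out by the ~longword term, so the expression is exactly what we want for any input.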
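To tie the two tricks together, here is a simplified sketch of a word-at-a-time strlen. It is not glibc's code: the name `wordwise_strlen` is hypothetical, it is fixed to 32-bit words, and it loads each word with memcpy where glibc dereferences an unsigned long pointer. Like the original, it may read a few bytes past the terminator inside the same aligned word. That is harmless on typical hardware, since an aligned word never straddles a page, but it is outside what ISO C guarantees.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical name; a simplified sketch of the technique.  */
static size_t
wordwise_strlen (const char *str)
{
  const char *p = str;

  assert (str != NULL);

  /* Trick 1: go byte by byte until P is aligned for a 32-bit load.  */
  while (((uintptr_t) p & (sizeof (uint32_t) - 1)) != 0)
    {
      if (*p == '\0')
        return (size_t) (p - str);
      ++p;
    }

  /* Trick 2: test four bytes at once for a zero byte.  */
  for (;;)
    {
      uint32_t w;
      memcpy (&w, p, sizeof w);   /* aligned 4-byte load */

      if (((w - UINT32_C (0x01010101)) & ~w & UINT32_C (0x80808080)) != 0)
        {
          /* Some byte in this word is zero; find the first one.  */
          for (size_t i = 0; i < sizeof w; ++i)
            if (p[i] == '\0')
              return (size_t) (p - str) + i;
        }
      p += sizeof w;
    }
}
```

Called on a buffer holding the six UTF-8 bytes "\xe4\xb8\xad\xe6\x96\x87", every one of which is above 0x7F, it should still report 6, the same as the standard strlen.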

2009-08-04 09:10

#13 Mr.Me 2010-03-24

Maybe my math just isn't good enough, but I still can't make sense of this:
himagic = 0x80808080L;
lomagic = 0x01010101L;
if (((longword - lomagic) & ~longword & himagic) != 0)

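#12 archerzz 2010-02-28

meta wrote:

joshzhu wrote:

Thanks to simohayha for the archaeological digging.

The first form treats bytes > 128 and bytes = 0 the same way.
In the second form, ~longword & himagic makes sure that bytes >= 128 are kept out of the special-case path, so the remaining & (longword - lomagic) only has to pick out the = 0 case among bytes < 128.

Neither form should be much of a problem, because the special-case code still goes on to look for the byte that is 0.

This strlen doesn't seem to be limited to counting ASCII characters...

With the first form, it is effectively the if chain afterwards that weeds out the > 128 case. Logically that's fine, but isn't performance slightly worse? My guess is that the "bug" was really a performance improvement; not everything in a bug tracker is necessarily a bug.
Also, I'm a bit curious: is this implementation really that much faster? The code is full of for and if; is it really efficient?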

#11 meta 2010-02-27

joshzhu wrote:

Thanks to simohayha for the archaeological digging.

#10 ajonjun 2010-01-15

Simple and clear, thanks for sharing. I hope to ask you a few questions when I get the chance.

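#9 linac 2009-08-10

The math behind it:
If a-1 does not borrow, the top bit is unchanged, and the top bit of ~a is the opposite of a's,
so the top bit of (a-1) & ~a is 0.
Only when a=0 does a-1 borrow; the top bit changes and matches the top bit of ~a, so the top bit of (a-1)&~a is 1.

Clearly a bug.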

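#8 simohayha 2009-08-05

joshzhu wrote:

Thanks to simohayha for the archaeological digging.

Here are a few points of mine:

1. if (((longword - lomagic) & himagic) != 0) and if (((longword - lomagic) & ~longword & himagic) != 0) are both correct. In other words, the earlier glibc had no bug! The extra & ~longword in the newer glibc is slower, so this fix is a step backwards.

2. Both forms of strlen handle values greater than 128 correctly.

3. This comment should be removed as well: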
/* Bits 31, 24, 16, and 8 of this number are zero. Call these bits
     the "holes." Note that there is a hole just to the left of
     each byte, with an extra at the end:

     bits: 01111110 11111110 11111110 11111111
     bytes: AAAAAAAA BBBBBBBB CCCCCCCC DDDDDDDD

     The 1-bits make sure that carries propagate to the next 0-bit.
     The 0-bits provide holes for carries to fall into. */

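P.S. Maybe I'll write the glibc folks an email when I have time? :wink:

Ha, I find it a bit odd too. There is clearly one extra AND operation, so it should be slower.

And with ~longword removed I can't see what goes wrong either, yet Ulrich Drepper went and fixed it as a bug... I support you emailing him to ask.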

#7 joshzhu 2009-08-05

Thanks to simohayha for the archaeological digging.

#6 andrew913 2009-08-05

himagic = 0x80808080L;
lomagic = 0x01010101L;

(longword - lomagic) & himagic

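seems easier to understand; I don't know why it got "bug-fixed".

If the string contains Chinese, this seems no different from walking it byte by byte. I wonder whether there is an efficient way.

Also, when four bytes are compared side by side, what if the very first byte is already '\0'?
Wouldn't that read out of bounds?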

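#5 andrew913 2009-08-05

I've always thought fresh graduates should dig into this kind of thing.
It has far better prospects than the "project experience" of building some XXXXX system.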

#4 simohayha 2009-08-04

joshzhu wrote:

What about 8 bytes on 64-bit? BTW, there is another way to write it:
if (((longword - lomagic) & himagic) != 0) {
...
}

P.S. Which version of glibc is the author reading?

Your form appears to have been fixed as a bug in 2.10:

http://sourceware.org/ml/glibc-cvs/2009-q1/msg00512.html

http://sources.redhat.com/bugzilla/show_bug.cgi?id=5807

The version I'm reading is 2.10.

#3 joshzhu 2009-08-04

What about 8 bytes on 64-bit? BTW, there is another way to write it:
if (((longword - lomagic) & himagic) != 0) {
...
}

P.S. Which version of glibc is the author reading?

#2 mryufeng 2009-08-04

The author of glibc, this very gentleman, wrote the following book:
http://mryufeng.iteye.com/blog/429084

#1 andrew913 2009-08-04

strlen, truly ingenious.
