A quick test: comparing the performance of two ways of building strings

AvinDev

Category: Erlang

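First, two articles worth reading:
http://www.wagerlabs.com/blog/2008/02/parsing-text-an.html
http://ppolv.wordpress.com/2008/02/25/parsing-csv-in-erlang/
(Both are blocked here and can only be reached by climbing over the wall. Damn that "Great Firewall".)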

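For parsing text protocols in Erlang, binaries are without doubt the efficient choice. But I noticed that when these articles assemble the individual bytes of a binary into a string, they always accumulate into a list:
NewList = lists:reverse([Char|OldList])
rather than into a binary:
NewList = binary_to_list(<<OldBin/binary, Char>>)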
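
To make the two approaches concrete, here is a minimal sketch that copies the bytes of a binary into a string both ways. The module name build_demo and the function names by_list and by_binary are made up for illustration; they do not come from the linked articles.

```erlang
-module(build_demo).
-export([by_list/1, by_binary/1]).

%% Approach 1: cons each byte onto an accumulator list,
%% then reverse once at the end.
by_list(Bin) -> by_list(Bin, []).

by_list(<<Char, Rest/binary>>, Acc) -> by_list(Rest, [Char | Acc]);
by_list(<<>>, Acc) -> lists:reverse(Acc).

%% Approach 2: append each byte to an accumulator binary,
%% then convert to a list once at the end.
by_binary(Bin) -> by_binary(Bin, <<>>).

by_binary(<<Char, Rest/binary>>, Acc) -> by_binary(Rest, <<Acc/binary, Char>>);
by_binary(<<>>, Acc) -> binary_to_list(Acc).
```

Both functions return the same string for the same input, so the only difference between them is how the intermediate result is accumulated.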

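I later ran a test, and it showed that for building large numbers of short strings, for example parsing <<"GET /index.html HTTP/1.1">> into ["GET","/index.html","HTTP/1.1"], accumulating into a list does work better.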
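
As a rough illustration of that kind of parsing, the sketch below splits a binary on single spaces and builds each field with the list approach. The module split_demo and its split/1 function are hypothetical names for this example, and the code assumes fields are separated by exactly one space (consecutive spaces would yield empty fields).

```erlang
-module(split_demo).
-export([split/1]).

%% Split a binary on single spaces and return the fields as strings.
split(Bin) -> split(Bin, [], []).

%% Field holds the characters of the current field in reverse order;
%% Fields holds the completed fields in reverse order.
split(<<$\s, Rest/binary>>, Field, Fields) ->
    %% A space ends the current field: reverse it and start a new one.
    split(Rest, [], [lists:reverse(Field) | Fields]);
split(<<Char, Rest/binary>>, Field, Fields) ->
    split(Rest, [Char | Field], Fields);
split(<<>>, Field, Fields) ->
    %% End of input: close the last field and restore field order.
    lists:reverse([lists:reverse(Field) | Fields]).
```

Calling split_demo:split(<<"GET /index.html HTTP/1.1">>) returns ["GET","/index.html","HTTP/1.1"].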

I wrote a simple loop-based test:

-module(append_test).
-export([test_append/0]).

%% Run both benchmarks over a range of sizes.
test_append() ->
    test_char_append(100),
    test_char_append(1000),
    test_char_append(10000),
    test_char_append(100000),
    test_char_append(1000000),
    test_char_append(10000000),
    test_field_append(10000),
    test_field_append(100000),
    test_field_append(200000),
    test_field_append(300000).

%% Build one string of Loop characters, first as a list, then as a binary.
%% statistics(wall_clock) returns {TotalMs, MsSinceLastCall}, so the first
%% call only resets the interval and each later call yields the elapsed time.
test_char_append(Loop) ->
    erlang:statistics(wall_clock),
    test_char_append_by_list(Loop, []),
    {_, T1} = erlang:statistics(wall_clock),
    test_char_append_by_binary(Loop, <<>>),
    {_, T2} = erlang:statistics(wall_clock),
    io:format("~p loops, test_char_append_by_list using time: ~pms~n", [Loop, T1]),
    io:format("~p loops, test_char_append_by_binary using time: ~pms~n~n", [Loop, T2]),
    ok.

%% Build Loop short strings (100 characters each) with each approach.
test_field_append(Loop) ->
    erlang:statistics(wall_clock),
    test_field_append_by_list(Loop, []),
    {_, T1} = erlang:statistics(wall_clock),
    test_field_append_by_binary(Loop, []),
    {_, T2} = erlang:statistics(wall_clock),
    io:format("~p loops, test_field_append_by_list using time: ~pms~n", [Loop, T1]),
    io:format("~p loops, test_field_append_by_binary using time: ~pms~n~n", [Loop, T2]),
    ok.

%% List approach: cons each character, reverse once at the end.
test_char_append_by_list(0, List) -> lists:reverse(List);
test_char_append_by_list(N, List) -> test_char_append_by_list(N-1, [$!|List]).

%% Binary approach: append each character, convert to a list once at the end.
test_char_append_by_binary(0, Bin) -> binary_to_list(Bin);
test_char_append_by_binary(N, Bin) -> test_char_append_by_binary(N-1, <<Bin/binary, $!>>).

%% Collect N fields, each a 100-character string built with the list approach.
test_field_append_by_list(0, List) -> lists:reverse(List);
test_field_append_by_list(N, List) ->
    Field = test_char_append_by_list(100, []),
    test_field_append_by_list(N-1, [Field|List]).

%% Collect N fields, each a 100-character string built with the binary approach.
test_field_append_by_binary(0, List) -> lists:reverse(List);
test_field_append_by_binary(N, List) ->
    Field = test_char_append_by_binary(100, <<>>),
    test_field_append_by_binary(N-1, [Field|List]).

The output was roughly as follows:

100 loops, test_char_append_by_list using time: 0ms
100 loops, test_char_append_by_binary using time: 0ms

1000 loops, test_char_append_by_list using time: 0ms
1000 loops, test_char_append_by_binary using time: 0ms

10000 loops, test_char_append_by_list using time: 0ms
10000 loops, test_char_append_by_binary using time: 0ms

100000 loops, test_char_append_by_list using time: 16ms
100000 loops, test_char_append_by_binary using time: 16ms

1000000 loops, test_char_append_by_list using time: 203ms
1000000 loops, test_char_append_by_binary using time: 156ms

10000000 loops, test_char_append_by_list using time: 2922ms
10000000 loops, test_char_append_by_binary using time: 1594ms

10000 loops, test_field_append_by_list using time: 62ms
10000 loops, test_field_append_by_binary using time: 172ms

100000 loops, test_field_append_by_list using time: 1109ms
100000 loops, test_field_append_by_binary using time: 1860ms

200000 loops, test_field_append_by_list using time: 2672ms
200000 loops, test_field_append_by_binary using time: 4937ms

300000 loops, test_field_append_by_list using time: 3438ms
300000 loops, test_field_append_by_binary using time: 7062ms

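So for short strings, a list is faster than a binary: in the field test, which builds 100-character strings, the list version was roughly 1.7 to 2.8 times faster. Only when a single string grows beyond 100,000 characters (who builds a list that long anyway?) does the binary gain a slight edge; in these runs the two were tied at 100,000 and the binary pulled ahead from 1,000,000 up. When building lots of short strings, just stick with consing onto a list and reversing it.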
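
The 0ms and 16ms rows above suggest that the wall clock was too coarse to separate the two approaches at small sizes. As a possible refinement, the sketch below times a single string build with timer:tc/1, which reports microseconds. The module tc_demo and the function compare/1 are hypothetical names, and this is only one way such a measurement could be set up, not the test used for the numbers above.

```erlang
-module(tc_demo).
-export([compare/1]).

%% Build one string of Len characters both ways and report the elapsed
%% time of each approach in microseconds.
compare(Len) ->
    {ListUs, S1} = timer:tc(fun() -> by_list(Len, []) end),
    {BinUs, S2} = timer:tc(fun() -> by_binary(Len, <<>>) end),
    true = (S1 =:= S2),  % sanity check: both approaches build the same string
    io:format("~p chars: list ~pus, binary ~pus~n", [Len, ListUs, BinUs]),
    {ListUs, BinUs}.

%% Same list approach as in the test: cons, then reverse once.
by_list(0, Acc) -> lists:reverse(Acc);
by_list(N, Acc) -> by_list(N - 1, [$! | Acc]).

%% Same binary approach as in the test: append, then convert once.
by_binary(0, Acc) -> binary_to_list(Acc);
by_binary(N, Acc) -> by_binary(N - 1, <<Acc/binary, $!>>).
```

A single call such as tc_demo:compare(100) is noisy, so repeating it several times and comparing the spread would give a fairer picture.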

2008-03-05 23:30