Erlang Efficiency Guide 3

zhongwencool

3 Common Caveats

Here we list a few modules and BIFs to watch out for, and not only from a performance point of view.

3.1 The timer module

Creating timers using erlang:send_after/3 and erlang:start_timer/3 is much more efficient than using the timers provided by the timer module. The timer module uses a separate process to manage the timers, and that process can easily become overloaded if many processes create and cancel timers frequently (especially when using the SMP emulator).

The functions in the timer module that do not manage timers (such as timer:tc/3 or timer:sleep/1) do not call the timer-server process and are therefore harmless.
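[Note] On the difference between erlang:send_after/3 and erlang:start_timer/3 (erlang:cancel_timer/1 can only cancel a timer that has not yet expired, that is, one whose message has not been sent yet):
http://www.cnblogs.com/me-sa/archive/2012/03/16/erlang-timer.html
With send_after, the timer delivers only Msg, so how does the receiver know which TimerRef a given Msg belongs to? You might suggest putting the TimerRef into the message. That is a good idea, but the TimerRef is not available until send_after returns, so the message cannot be built with it. The receiver then cannot tell which timer a timeout message came from, and that can break the program logic.
For that reason erts version 5.4.11 introduced start_timer. When the timer expires, it wraps the message as {timeout, TimerRef, Msg} before sending it, so the receiver can identify the timer.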
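The difference described above can be sketched as follows. This is a minimal illustration, not code from the linked article; the module name timer_demo and its function names are made up for the example.

```erlang
-module(timer_demo).
-export([plain/0, tagged/0, cancelled/0]).

%% send_after/3 delivers Msg exactly as given, so nothing in the
%% received message ties it to the timer reference.
plain() ->
    _Ref = erlang:send_after(10, self(), ping),
    receive
        ping -> got_ping
    after 1000 -> no_message
    end.

%% start_timer/3 delivers {timeout, TimerRef, Msg}; matching on the
%% already bound Ref selects the message from this particular timer.
tagged() ->
    Ref = erlang:start_timer(10, self(), ping),
    receive
        {timeout, Ref, ping} -> got_tagged_ping
    after 1000 -> no_message
    end.

%% cancel_timer/1 returns the remaining time in milliseconds for a
%% pending timer, and false once the timer is gone.
cancelled() ->
    Ref = erlang:start_timer(60000, self(), ping),
    Left = erlang:cancel_timer(Ref),
    Again = erlang:cancel_timer(Ref),
    {is_integer(Left), Again}.
```

A false result from cancel_timer/1 means the timeout message may already be in the mailbox, which is exactly the case where the TimerRef inside the start_timer message helps.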

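Erlang offers timers in three ways:
1. At the language level: receive ... after ...
This is implemented with opcodes. As soon as the timeout expires, the process is put on the run queue. It is used very frequently.
2. BIFs
erlang:send_after(Time, Dest, Msg) -> TimerRef
erlang:start_timer(Time, Dest, Msg) -> TimerRef
When the timeout expires, a message is sent to Dest. These are used less frequently.
3. At the driver level
int driver_set_timer(ErlDrvPort port, unsigned long time);
inet_driver uses this API heavily. TCP/UDP ports need timeout handling, so with many connections the number of these timers becomes very large. When such a timer expires, a port_task is put on the run queue.
The earliest expiry time among the timers is used as the wait time for poll.
All timers are driven by bump_timer, which is called at irregular intervals during scheduling. In short, be careful with timers: they run in the scheduler thread, so they must not do time-consuming work, or they will block the scheduler.
[Our project has a lot of per-second loops. If a single erlang:send_after/3 loop drove all of them (one manager process sending a message to every service each second, instead of each service starting its own erlang:send_after/3), erlang:send_after/3 would not have to be called so often. That is what the project lead said, but I have not checked it myself.]
PS: the erl -s module func arg1 arg2 trick in the last article is really cool.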
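The "one manager drives all per-second loops" idea above could look roughly like this. It is only a sketch under assumptions: the module tick_demo, the tick and stop messages, and the fixed subscriber list are all invented for illustration, and whether this is actually cheaper than one send_after per service has not been measured here.

```erlang
-module(tick_demo).
-export([start/2, stop/1]).

%% Start one ticker process that sends `tick` to every pid in
%% Subscribers each IntervalMs milliseconds.
start(IntervalMs, Subscribers) ->
    spawn(fun() ->
        erlang:send_after(IntervalMs, self(), tick),
        loop(IntervalMs, Subscribers)
    end).

stop(Ticker) ->
    Ticker ! stop,
    ok.

loop(IntervalMs, Subscribers) ->
    receive
        tick ->
            %% Re-arm the single timer first, then fan the tick out.
            erlang:send_after(IntervalMs, self(), tick),
            lists:foreach(fun(Pid) -> Pid ! tick end, Subscribers),
            loop(IntervalMs, Subscribers);
        stop ->
            ok
    end.
```

Only the ticker process owns a timer; the subscribers just receive plain tick messages.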

3.2 list_to_atom/1

Atoms are not garbage-collected. Once an atom is created, it will never be removed. The emulator will terminate if the limit for the number of atoms (1048576 by default) is reached.

Therefore, converting arbitrary input strings to atoms could be dangerous in a system that will run continuously. If only certain well-defined atoms are allowed as input, you can use list_to_existing_atom/1 to guard against a denial-of-service attack. (All atoms that are allowed must have been created earlier, for instance by simply using all of them in a module and loading that module.)
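A minimal sketch of that guard, assuming a hypothetical module atom_demo whose allowed commands are start, stop and status. The atoms exist because they appear as literals in the loaded module.

```erlang
-module(atom_demo).
-export([parse_command/1]).

%% The allowed atoms are created when this module is loaded,
%% because they appear here as literals.
allowed() -> [start, stop, status].

%% Convert untrusted input to an atom without ever creating a new one.
parse_command(Str) when is_list(Str) ->
    try list_to_existing_atom(Str) of
        Atom ->
            case lists:member(Atom, allowed()) of
                true -> {ok, Atom};
                false -> {error, not_allowed}
            end
    catch
        %% list_to_existing_atom/1 raises badarg for unknown atoms.
        error:badarg -> {error, unknown_command}
    end.
```

The extra lists:member/2 check matters because list_to_existing_atom/1 succeeds for any atom already in the system, not only the ones this module cares about.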

Using list_to_atom/1 to construct an atom that is passed to apply/3 like this

apply(list_to_atom("some_prefix"++Var), foo, Args)

is quite expensive and is not recommended in time-critical code.

3.3 length/1

The time for calculating the length of a list is proportional to the length of the list, as opposed to tuple_size/1, byte_size/1, and bit_size/1, which all execute in constant time.

Normally you don't have to worry about the speed of length/1, because it is efficiently implemented in C. In time-critical code, though, you might want to avoid it if the input list could potentially be very long.

Some uses of length/1 can be replaced by matching. For instance, this code

foo(L) when length(L) >= 3 ->
    ...

can be rewritten to

foo([_,_,_|_]=L) ->
  ...

(One slight difference is that length(L) will fail if L is an improper list, while the pattern in the second code fragment will accept an improper list.)
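The difference for improper lists can be seen in a small sketch. The module length_demo and its function names are hypothetical. A failing length/1 call inside a guard simply makes that guard fail.

```erlang
-module(length_demo).
-export([by_length/1, by_pattern/1]).

%% length/1 walks the whole list; in a guard, a failure on an
%% improper list makes the guard false and the next clause is tried.
by_length(L) when length(L) >= 3 -> at_least_three;
by_length(_) -> other.

%% The pattern inspects only the first three cells and never
%% looks at the tail, so an improper tail is accepted.
by_pattern([_,_,_|_]) -> at_least_three;
by_pattern(_) -> other.
```

For example, by_length([1,2,3|x]) falls through to the second clause, while by_pattern([1,2,3|x]) matches the first.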

3.4 setelement/3

setelement(Index, Tuple1, Value) -> Tuple2

setelement/3 copies the tuple it modifies. Therefore, updating a tuple in a loop using setelement/3 will create a new copy of the tuple every time.

There is one exception to the rule that the tuple is copied. If the compiler clearly can see that destructively updating the tuple would give exactly the same result as if the tuple was copied, the call to setelement/3 will be replaced with a special destructive setelement instruction. In the following code sequence

multiple_setelement(T0) ->
    T1 = setelement(9, T0, bar),
    T2 = setelement(7, T1, foobar),
    setelement(5, T2, new_value).

the first setelement/3 call will copy the tuple and modify the ninth element. The two following setelement/3 calls will modify the tuple in place.

For the optimization to be applied, all of the following conditions must be true:

* The indices must be integer literals, not variables or expressions.
* The indices must be given in descending order.
* There must be no calls to other functions in between the calls to setelement/3.
* The tuple returned from one setelement/3 call must only be used in the subsequent call to setelement/3.

If it is not possible to structure the code as in the multiple_setelement/1 example, the best way to modify multiple elements in a large tuple is to convert the tuple to a list, modify the list, and convert the list back to a tuple.
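One possible shape for the tuple-to-list round trip is sketched below. The module tuple_demo and the function update_many/2 are made up for the example, and using a map to look up the replacements is just one convenient choice. Whether this beats repeated setelement/3 calls depends on the tuple size and the number of updates.

```erlang
-module(tuple_demo).
-export([update_many/2]).

%% Updates is a list of {Index, NewValue} pairs with 1-based indices.
%% The tuple is converted to a list once, rebuilt with the new
%% values, and converted back once.
update_many(Tuple, Updates) ->
    New = maps:from_list(Updates),
    Indexed = lists:zip(lists:seq(1, tuple_size(Tuple)),
                        tuple_to_list(Tuple)),
    list_to_tuple([maps:get(I, New, Old) || {I, Old} <- Indexed]).
```

With this sketch, update_many({a,b,c,d}, [{2,x},{4,y}]) builds only one new tuple instead of one per changed element.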

3.5 size/1

size/1 returns the size for both tuples and binaries.

Using the new BIFs tuple_size/1 and byte_size/1 introduced in R12B gives the compiler and run-time system more opportunities for optimization. A further advantage is that the new BIFs could help Dialyzer find more bugs in your program.
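As a small illustration, the type-specific BIFs can be used directly in guards. The module size_demo and the function describe/1 are hypothetical names.

```erlang
-module(size_demo).
-export([describe/1]).

%% tuple_size/1 and byte_size/1 state which type is expected,
%% unlike size/1, which accepts both tuples and binaries.
describe(T) when tuple_size(T) =:= 2 -> pair;
describe(T) when is_tuple(T) -> {tuple, tuple_size(T)};
describe(B) when is_binary(B) -> {binary, byte_size(B)}.
```

In the first clause the guard simply fails for anything that is not a tuple, so no separate is_tuple/1 test is needed there.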

3.6 split_binary/2

It is usually more efficient to split a binary using matching instead of calling the split_binary/2 function. Furthermore, mixing bit syntax matching and split_binary/2 may prevent some optimizations of bit syntax matching.

DO

 <<Bin1:Num/binary,Bin2/binary>> = Bin,

DO NOT

 {Bin1,Bin2} = split_binary(Bin, Num)
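For reference, the matching form can be wrapped in a function that gives the same result as split_binary/2. This is a sketch; split_demo:split/2 is a made-up name, and the guard is an added assumption so that bad input fails with function_clause rather than badmatch.

```erlang
-module(split_demo).
-export([split/2]).

%% Split Bin after Num bytes using bit syntax matching only.
split(Bin, Num) when is_binary(Bin), is_integer(Num),
                     Num >= 0, Num =< byte_size(Bin) ->
    <<Head:Num/binary, Tail/binary>> = Bin,
    {Head, Tail}.
```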

3.7 The '--' operator

Note that the '--' operator has a complexity proportional to the product of the lengths of its operands, meaning that it will be very slow if both of its operands are long lists:

DO NOT

 HugeList1 -- HugeList2

Instead use the ordsets module:

DO

        HugeSet1 = ordsets:from_list(HugeList1),
        HugeSet2 = ordsets:from_list(HugeList2),
        ordsets:subtract(HugeSet1, HugeSet2)

Obviously, that code will not work if the original order of the list is important. If the order of the list must be preserved, do it like this:

        Set = gb_sets:from_list(HugeList2),
        [E || E <- HugeList1, not gb_sets:is_element(E, Set)]

Subtle note 1: This code behaves differently from '--' if the lists contain duplicate elements. (One occurrence of an element in HugeList2 will remove all occurrences in HugeList1.)

Subtle note 2: This code compares list elements using the '==' operator, while '--' uses '=:='. If that difference is important, sets can be used instead of gb_sets, but note that sets:from_list/1 is much slower than gb_sets:from_list/1 for long lists.
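Both subtle notes can be checked with a small sketch that wraps the gb_sets version in a function. The module subtract_demo and the function filter_subtract/2 are hypothetical names.

```erlang
-module(subtract_demo).
-export([filter_subtract/2]).

%% Order-preserving subtraction: keep every element of L1 that
%% does not occur in L2, using a gb_sets lookup per element.
filter_subtract(L1, L2) ->
    Set = gb_sets:from_list(L2),
    [E || E <- L1, not gb_sets:is_element(E, Set)].
```

With duplicates, [a,b,a] -- [a] removes only the first a, while filter_subtract([a,b,a], [a]) removes both. With mixed number types, [1,2] -- [1.0] removes nothing because 1 =:= 1.0 is false, while filter_subtract([1,2], [1.0]) drops the 1.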

Using the '--' operator to delete an element from a list is not a performance problem:

 HugeList1 -- [Element]


2013-08-19 22:19