Things to Watch Out for When Frequently Sending Remote Messages in Erlang

AvinDev

Note: this article may be controversial; comments are welcome.

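In Erlang, communication between two remote nodes goes over the network, and message passing uses TCP. If you need to send messages between two nodes frequently, say several hundred to a few thousand per second, there are things to watch out for.

Both books on online game server development and seasoned engineers will tell you that, when sending packets, you should combine small messages into one larger packet whenever possible. A TCP packet header is large relative to a small message, so sending each message separately wastes bandwidth, and every call into the low-level send routine also has a cost. One engineer told me that a typical figure is around 20,000 such calls per second.

A simple test. First, the code.

A server that receives messages and immediately discards them: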

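%% The module name is arbitrary; the original snippet did not give one.
-module(null_server).
-export([start/0]).

%% Registers the calling process as nullserver and discards every message.
start() ->
    register(nullserver, self()),
    loop().

loop() ->
    receive
        _Any ->
            loop() % drop the message and keep looping
    end.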

A client that sends messages to it in a loop:

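%% The module name is arbitrary; the original snippet did not give one.
-module(null_client).
-export([start/0]).

%% Sends 100 small messages, one at a time, to the registered server process.
start() ->
    start_send(100).

start_send(0) ->
    ok;
start_send(N) ->
    {nullserver, 'foo@192.168.0.3'} ! hi,
    start_send(N-1).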

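Then I started a packet capture tool and ran the server and the client. It captured close to 200 sent and received packets, most of which looked like this:

Quote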

00 14 78 B9 14 BC 00 11-11 9F 91 1A 08 00 45 00
00 45 EE 77 40 00 80 06-80 E4 C0 A8 00 CC DB E8
ED F9 13 58 C1 C6 AA 4E-59 F2 38 CF 22 2D 50 18
FF 19 B9 EE 00 00 00 00-00 19 70 83 68 04 61 06
67 43 CC 00 00 00 01 00-00 00 00 02 43 05 43 BD
83 43 BF

Quote

00 14 78 B9 14 BC 00 11-11 9F 91 1A 08 00 45 00
00 45 EE 78 40 00 80 06-80 E3 C0 A8 00 CC DB E8
ED F9 13 58 C1 C6 AA 4E-5A 0F 38 CF 22 2D 50 18
FF 19 B9 D1 00 00 00 00-00 19 70 83 68 04 61 06
67 43 CC 00 00 00 01 00-00 00 00 02 43 05 43 BD
83 43 BF

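In fact, only the part starting at 00 00-00 19 is the TCP payload. The 54 bytes before it are lower-layer protocol headers (14 bytes of Ethernet, 20 bytes of IP, 20 bytes of TCP), against just 29 bytes of payload. A packet like this was sent 100 times, which is a huge waste. Moreover, after the messages were sent, the same number of response packets came back, similar to this one:

Quote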

00 11 11 9F 91 1A 00 14-78 B9 14 BC 08 00 45 00
00 28 8C FC 40 00 32 06-30 7D DB E8 ED F9 C0 A8
00 CC C1 C6 13 58 38 CF-22 2D AA 4E 59 F2 50 10
19 20 D7 01 00 00 00 00-00 00 00 00

Response packets like this one are pure TCP acknowledgements with no payload, and they waste bandwidth too.
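To put rough numbers on this, here is a small back-of-the-envelope calculation using byte counts read off the dumps above. The module name frame_overhead and its functions are made up for illustration. "Payload" here means the TCP payload, which still includes the distribution protocol's own framing, so the share of truly useful data is even smaller.

```erlang
-module(frame_overhead).
-export([per_message/0, total/1]).

%% Byte counts taken from the capture in this article.
-define(ETH_HEADER, 14).
-define(IP_HEADER, 20).
-define(TCP_HEADER, 20).
-define(PAYLOAD, 29).    % 4-byte length prefix + 25 bytes of distribution data
-define(ACK_FRAME, 60).  % ACK frame as captured (54 header bytes + 6 padding)

%% Header bytes, frame size and header share for one message frame.
per_message() ->
    Headers = ?ETH_HEADER + ?IP_HEADER + ?TCP_HEADER,
    Frame = Headers + ?PAYLOAD,
    #{headers => Headers, frame => Frame, header_share => Headers / Frame}.

%% Bytes captured for N messages when each one travels in its own
%% segment and is answered by its own ACK, versus TCP payload bytes.
total(N) when is_integer(N), N >= 0 ->
    #{frame := Frame} = per_message(),
    #{wire_bytes => N * (Frame + ?ACK_FRAME),
      payload_bytes => N * ?PAYLOAD}.
```

For the 100 messages of the test, frame_overhead:total(100) gives 14300 captured bytes for 2900 bytes of TCP payload.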

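In the documentation I have read so far, I have not found any parameter for buffering these messages and sending them together at intervals. So what can be done? I have two approaches of my own.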

The first is to pack a batch of outgoing messages into a list and send the list; the receiver takes every message out of the list and processes it.
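A minimal sketch of this first approach might look like the following. The module name list_batch, the {batch, Messages} wrapper tuple and the function names are assumptions made for illustration; they are not part of the implementation given later in this article.

```erlang
-module(list_batch).
-export([chunk/2, send_in_batches/3, handle_batch/2]).

%% Splits Messages into lists of at most Size elements, preserving order.
chunk(Messages, Size) when is_list(Messages), is_integer(Size), Size > 0 ->
    do_chunk(Messages, Size).

do_chunk([], _Size) ->
    [];
do_chunk(Messages, Size) when length(Messages) =< Size ->
    [Messages];
do_chunk(Messages, Size) ->
    {Batch, Rest} = lists:split(Size, Messages),
    [Batch | do_chunk(Rest, Size)].

%% Sends each chunk as one {batch, List} message (hypothetical wrapper).
%% Dest can be a pid, a registered name, or a {Name, Node} tuple.
send_in_batches(Dest, Messages, Size) ->
    lists:foreach(fun(Batch) -> Dest ! {batch, Batch} end,
                  chunk(Messages, Size)),
    ok.

%% Receiver side: applies Handler to every message in a batch, in order.
handle_batch({batch, Messages}, Handler) when is_function(Handler, 1) ->
    lists:foreach(Handler, Messages).
```

The sender decides the batch size here, which keeps the receiver trivial, but it also means the caller has to collect the messages before sending them.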

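The second is to go through a proxy. Instead of sending with {Name, Node} ! Message, the sender hands the message to a local proxy process. The proxy accumulates all messages bound for a given node and sends them over in batches. The receiving side has a listening process that receives each batch, iterates over it, and forwards every message to the appropriate local process.

Here is my first attempt at an implementation. It is not very pretty and is for reference only.

message_agent.erl: batch sending, receiving, and forwarding of messages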

-module(message_agent).
-export([listen/0, proxy/2, block_exit/1]).
-export([loop_receive/0]).
-define(MAX_BATCH_MESSAGE_SIZE, 50).

%% Receiving side: start the process that unpacks batches and forwards them.
listen() ->
    io:format("Message agent server start listen~n"),
    spawn(fun() -> register('MsgServerAgent', self()), loop_receive() end),
    ok.

loop_receive() ->
    receive
        {forward_message, PName, Messages} ->
            forward_messages(PName, Messages),
            loop_receive();
        _Any ->
            %% Ignore anything that is not a batch
            message_agent:loop_receive()
    end.

forward_messages(_PName, []) ->
    ok;
forward_messages(PName, [H|T]) ->
    %io:format("Forward message ~w to process ~w~n", [H, PName]),
    catch PName ! H,    % PName may not be registered; ignore the error
    forward_messages(PName, T).


%% Sending side: start a proxy that buffers messages bound for PName on Node.
proxy(Node, PName) ->
    spawn_link(fun() -> handle_message_forward(Node, PName, []) end).

%% Ask the proxy to flush what it still holds and wait for its confirmation.
block_exit(Agent) ->
    Agent ! {block_wait, self()},
    receive
        {unblock} ->
            ok
    end.

handle_message_forward(Node, PName, Messages) ->
    %% With an empty buffer, block until a message arrives instead of spinning;
    %% with a non-empty one, flush as soon as the mailbox is drained.
    Timeout = case Messages of
                  [] -> infinity;
                  _ -> 0
              end,
    receive
        {block_wait, Pid} ->
            catch send_batch(Node, PName, lists:reverse(Messages)),
            Pid ! {unblock};
        Any ->
            NewMessages = [Any|Messages],
            case length(NewMessages) >= ?MAX_BATCH_MESSAGE_SIZE of
                true ->
                    send_batch(Node, PName, lists:reverse(NewMessages)),
                    handle_message_forward(Node, PName, []);
                false ->
                    handle_message_forward(Node, PName, NewMessages)
            end
    after
        Timeout ->
            catch send_batch(Node, PName, lists:reverse(Messages)),
            handle_message_forward(Node, PName, [])
    end.

send_batch(Node, PName, Messages) ->
    %io:format("Send batch message, size ~p~n", [length(Messages)]),
    {'MsgServerAgent', Node} ! {forward_message, PName, Messages}.

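Usage is simple. On the receiving side, call message_agent:listen() to start the listening agent. On the client side, start the proxy process with register(agent, message_agent:proxy(?NODE, 'MsgServer')) and send messages to that proxy. Here is a simple example I wrote: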

-module(message_server).
-export([start/0]).
-define(TIMEOUT_MS, 1000).

start() ->
    io:format("Message server start~n"),
    register('MsgServer', self()),
    message_agent:listen(),
    loop_receive(0).

%% Counts incoming messages and prints the count after 1 second of silence.
loop_receive(Count) ->
    receive
        _Any ->
            %io:format("Receive msg ~w~n", [_Any]),
            loop_receive(Count+1)
    after
        ?TIMEOUT_MS ->
            if
                Count > 0 ->
                    io:format("Previous receive msg count: ~p~n", [Count]);
                true ->
                    ok
            end,
            loop_receive(0)
    end.

-module(message_client).
-define(NODE, 'msgsrv@192.168.0.3').
-define(COUNT, 20000).
-export([start/0]).

start() ->
    statistics(wall_clock),    % reset the wall clock interval
    register(agent, message_agent:proxy(?NODE, 'MsgServer')),
    send_loop(?COUNT).

send_loop(0) ->
    %% Wait until the proxy has flushed its last batch before timing
    message_agent:block_exit(agent),
    {_, Interval} = statistics(wall_clock),
    io:format("Finished ~p sends in ~p ms, exiting...~n", [?COUNT, Interval]);
send_loop(Count) ->
    agent ! {self(), lalala},
    send_loop(Count-1).

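Note that both the sending side and the receiving side funnel all messages through a single process. Erlang's default heap implementation uses private heaps, so sending a message between local processes requires copying it. When the data volume is large, that process's heap will be garbage-collected quite frequently.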

Posted 2007-05-01 21:09

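#8 lzy.je 2009-02-25

mryufeng wrote:

to lzy.je: 1. On a write, it first tries to write directly. Only when the peer is blocked does it register a write event, and as soon as the peer unblocks it issues the write. How could that not improve throughput? 2. It has nothing to do with Nagle; it is implemented entirely by the inet driver itself!

1. Yes, I have agreed all along that throughput improves, but this "semi-asynchronous" approach does little for response time. Just my opinion.
2. Understood.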

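#7 mryufeng 2009-02-25

to lzy.je:
1. On a write, it first tries to write directly. Only when the peer is blocked does it register a write event, and as soon as the peer unblocks it issues the write. How could that not improve throughput?
2. It has nothing to do with Nagle; it is implemented entirely by the inet driver itself!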

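#6 lzy.je 2009-02-25

mryufeng wrote:

to lzy.je: This does not turn off the Nagle algorithm. It buffers the data internally, registers a writable event, and only performs the write when the peer is writable; that is why it is called delay. But it greatly improves throughput! blog.yufeng.info

to mryufeng:
1. Raising throughput this way does not mean a shorter response time, so from a socket client's point of view it does not reflect real performance. It amounts to introducing a buffering mechanism.
2. A question: with {delay_send, Boolean}, is Erlang's implementation different from this?
setsockopt(sock, IPPROTO_TCP, TCP_NODELAY, ....)
Is it implemented entirely inside the Erlang driver?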

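#5 mryufeng 2009-02-24

to lzy.je: This does not turn off the Nagle algorithm. It buffers the data internally, registers a writable event, and only performs the write when the peer is writable; that is why it is called delay. But it greatly improves throughput!

blog.yufeng.info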

#4 AvinDev 2009-02-11

Yes, it is a trade-off that depends on the specific situation.

#3 lzy.je 2009-02-11

AvinDev wrote:

I found that inet:setopts has an option that can solve this problem:

Quote

{delay_send, Boolean}
Normally, when an Erlang process sends to a socket, the driver will try to immediately send the data. If that fails, the driver will use any means available to queue up the message to be sent whenever the operating system says it can handle it. Setting {delay_send, true} will make all messages queue up. This makes the messages actually sent onto the network be larger but fewer. The option actually affects the scheduling of send requests versus Erlang processes instead of changing any real property of the socket. Needless to say it is an implementation specific option. Default is false.

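I'll try it when I have time.

This approach disables the Nagle algorithm. Merging the data does effectively reduce the number of sends and the amount of data, but it introduces some latency.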

#2 AvinDev 2007-06-11

I found that inet:setopts has an option that can solve this problem:

Quote

{delay_send, Boolean}
Normally, when an Erlang process sends to a socket, the driver will try to immediately send the data. If that fails, the driver will use any means available to queue up the message to be sent whenever the operating system says it can handle it. Setting {delay_send, true} will make all messages queue up. This makes the messages actually sent onto the network be larger but fewer. The option actually affects the scheduling of send requests versus Erlang processes instead of changing any real property of the socket. Needless to say it is an implementation specific option. Default is false.

I'll try it when I have time.
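As a rough illustration of how the option is passed, the sketch below sets {delay_send, true} on an ordinary gen_tcp socket over loopback. The module name delay_send_demo is made up, and this only demonstrates the option on a plain socket. For the distribution connection between nodes, socket options would, as far as I know, go through the kernel application's inet_dist_connect_options and inet_dist_listen_options parameters; whether delay_send actually helps there is not tested here.

```erlang
-module(delay_send_demo).
-export([run/0]).

%% Opens a loopback listener, connects with {delay_send, true}, sends three
%% small messages and returns everything the accepting side received.
run() ->
    {ok, Listen} = gen_tcp:listen(0, [binary, {active, false},
                                      {ip, {127,0,0,1}}]),
    {ok, Port} = inet:port(Listen),
    {ok, Client} = gen_tcp:connect({127,0,0,1}, Port,
                                   [binary, {active, false},
                                    {delay_send, true}]),
    {ok, Server} = gen_tcp:accept(Listen, 5000),
    %% Each send is queued by the driver rather than written immediately
    [ok = gen_tcp:send(Client, <<"hi">>) || _ <- lists:seq(1, 3)],
    %% Read exactly the 6 bytes that were sent, however they were packed
    {ok, Data} = gen_tcp:recv(Server, 6, 5000),
    ok = gen_tcp:close(Client),
    ok = gen_tcp:close(Server),
    ok = gen_tcp:close(Listen),
    Data.
```

The receiver cannot tell from the data how many segments were used, so checking whether the sends were really merged still requires a packet capture like the one in the article.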

#1 potian 2007-05-11

In Joe's new book this is called socket based distribution. The code is in the directory
http://media.pragprog.com/titles/jaerlang/code/socket_dist/
The link above is blocked in China; you can use Tor to reach it.
