- present an overview of the techniques to implement Comet
- introduce Socket.io and a basic application
- discuss the complexities of real-world deployment with Socket.io
So... do you want to build a chat? Or a real-time multiplayer game?
In order to build a (soft-) real-time app, you need the ability to update information quickly within the end user's browser.
HTTP was not designed to support full two-way communication. However, there are multiple ways in which the client can receive information in real time or almost real time:
Techniques to implement Comet
Periodic polling. Essentially, you ask the server whether it has new data every n seconds, and idle meanwhile:
Client: Are we there yet? Server: No Client: [Wait a few seconds] Client: Are we there yet? Server: No Client: [Wait a few seconds] ... (repeat this a lot) Client: Are we there yet? Server: Yes. Here is a message for you.
The problem with periodic polling is that: 1) it tends to generate a lot of requests and 2) it's not instant - if messages arrive during the time the client is waiting, then those will only be received later.
Long polling. This is similar to periodic polling, except that the server does not return the response immediately. Instead, the response is kept in a pending state until either new data arrives, or the request times out in the browser. Compared to periodic polling, the advantage here is that clients need to make fewer requests (requests are only made again if there is data) and that there is no "idle" timeout between making requests: a new request is made immediately after receiving data.
Client: Are we there yet? Server: [Wait for ~30 seconds] Server: No Client: Are we there yet? Server: Yes. Here is a message for you.
This approach is slightly better than periodic polling, since messages can be delivered immediately as long as a pending request exists. The server holds on to the request until the timeout triggers or a new message is available, so there will be fewer requests.
However, if you need to send a message to the server from the client while a long polling request is ongoing, a second request has to be made back to the server since the data cannot be sent via the existing (HTTP) request.
Sockets / long-lived connections. WebSockets (and other transports with socket semantics) improve on this further. The client connects once, and then a permanent TCP connection is maintained. Messages can be passed in both ways through this single request. As a conversation:
Client: Are we there yet? Server: [Wait for until we're there] Server: Yes. Here is a message for you.
If the client needs to send a message to the server, it can send it through the existing connection rather than through a separate request. This efficient and fast, but Websockets are only available in newer, better browsers.
Socket.io
As you can see above, there are several different ways to implement Comet.
Socket.io offers several different transports:
- Long polling: XHR-polling (using XMLHttpRequest), JSONP polling (using JSON with padding), HTMLFile (forever Iframe for IE)
- Sockets / long-lived connections: Flash sockets (Websockets over plain TCP sockets using Flash) and Websockets
Ideally, we would like to use the most efficient transport (Websockets) - but fall back to other transports on older browsers. This is what Socket.io does.
Writing a basic application
I almost left this part out, since I can't really do justice to Socket.io in one section of a full chapter. But here it is: a simple example using Socket.io. In this example, we will write simple echo server that receives messages from the browser and echoes them back.
Let's start with a package.json:
{"name":"siosample","description":"Simple Socket.io app","version":"0.0.1","main":"server.js","dependencies":{"socket.io":"0.8.x"},"private":"true"}
This allows us to install the app with all the dependencies using npm install
.
In server.js:
var fs =require('fs'), http =require('http'), sio =require('socket.io');var server = http.createServer(function(req, res){ res.writeHead(200,{'Content-type':'text/html'}); res.end(fs.readFileSync('./index.html'));}); server.listen(8000,function(){ console.log('Server listening at http://localhost:8000/');});// Attach the socket.io server io = sio.listen(server);// store messagesvar messages =[];// Define a message handler io.sockets.on('connection',function(socket){ socket.on('message',function(msg){ console.log('Received: ', msg); messages.push(msg); socket.broadcast.emit('message', msg);});// send messages to new clients messages.forEach(function(msg){ socket.send(msg);})});
First we start a regular HTTP server that always respondes with the content of "./index.html". Then the Socket.io server is attached to that server, allowing Socket.io to respond to requests directed towards it on port 8000.
Socket.io follows the basic EventEmitter pattern: messages and connection state changes become events on socket
. On "connection", we add a message handler that echoes the message back and broadcasts it to all other connected clients. Additionally, messages are stored in memory in an array, which is sent back to new clients so that the can see previous messages.
Next, let's write the client page (index.html):
<html><head><styletype="text/css">#messages { padding: 0px; list-style-type: none;}#messages li { padding: 2px 0px; border-bottom: 1px solid #ccc; }</style><scriptsrc="http://code.jquery.com/jquery-1.7.1.min.js"></script><scriptsrc="/socket.io/socket.io.js"></script><script> $(function(){var socket = io.connect(); socket.on('connect',function(){ socket.on('message',function(message){ $('#messages').append($('<li></li>').text(message));}); socket.on('disconnect',function(){ $('#messages').append('<li>Disconnected</li>');});});var el = $('#chatmsg'); $('#chatmsg').keypress(function(e){if(e.which ==13){ e.preventDefault(); socket.send(el.val()); $('#messages').append($('<li></li>').text(el.val())); el.val('');}});});</script></head><body><ulid="messages"></ul><hr><inputtype="text"id="chatmsg"></body></html>
BTW, "/socket.io/socket.io.js" is served by Socket.io, so you don't need to have a file placed there.
To start the server, run node server.js
and point your browser tohttp://localhost:8000/. To chat between two users, open a second tab to the same address.
Additional features
Check out the Socket.io website (and Github for server, client) for more information on using Socket.io.
There are two more advanced examples on Github.
I'm going to focus on deployment, which has not been covered in depth.
Deploying Socket.io: avoiding the problems
As you can see above, using Socket.io is fairly simple. However, there are several issues related to deploying an application using Socket.io which need to be addressed.
Same origin policy, CORS and JSONP
The same origin policy is a security measure built in to web browsers. It restricts access to the DOM, Javascript HTTP requests and cookies.
In short, the policy is that the protocol (http vs https), host (www.example.com vs example.com) and port (default vs e.g. :8000) of the request must match exactly.
Requests that are made from Javascript to a different host, port or protocol are not allowed, except via two mechanisms:
Cross-Origin Resource Sharing is a way for the server to tell the browser that a request that violates the same origin policy is allowed. This is done by adding a HTTP header (Access-Control-Allow-Origin:) and applies to requests made from Javascript.
JSONP, or JSON with padding, is an alternative technique, which relies on the fact that the <script> tag is not subject to the same origin policy to receive fragments of information (as JSON in Javascript).
Socket.io supports these techniques, but you should consider try to set up you application in such a way that the HTML page using Socket.io is served from the same host, port and protocol. Socket.io can work even when the pages are different, but it is subject to more browser restrictions, because dealing with the same origin policy requires additional steps in each browser.
There are two important things to know:
First, you cannot perform requests from a local file to external resources in most browsers. You have to serve the page you use Socket.io on via HTTP.
Second, IE 8 will not work with requests that 1) violate the same origin policy (host/port) and 2) also use a different protocol. If you serve your page via HTTP (http://example.com/index.html) and attempt to connect to HTTPS (https://example.com:8000), you will see an error and it will not work.
My recommendation would be to only use HTTPS or HTTP everywhere, and to try to make it so that all the requests (serving the page and making requests to the Socket.io backend) appear to the browser as coming from the same host, port and protocol. I will discuss some example setups further below.
Flashsockets support requires TCP mode support
Flash sockets have their own authorization mechanism, which requires that the server first sends a fragment of XML (a policy file) before allowing normal TCP access.
This means that from the perspective of a load balancer, flash sockets do not look like HTTP and thus require that your load balancer can operate in tcp mode.
I would seriously consider not supporting flashsockets because they introduce yet another hurdle into the deployment setup by requiring TCP mode operation.
Websockets support requires HTTP 1.1 support
It's fairly common for people to run nginx in order to serve static files, and add a proxy rule from nginx to Node.
However, if you put nginx in front of Node, then the only connections that will work are long polling -based. You cannot use Websockets with nginx, because Websockets require HTTP 1.1 support throughout your stack to work and current versions of nginx do not support this.
I heard from the nginx team on my blog that they are working on HTTP 1.1 support (and you may be able to find a development branch of nginx that supports this), but as of now the versions of nginx that are included in most Linux distributions do not support this.
This means that you have to use something else. A common option is to use HAProxy in front, which supports HTTP 1.1 and can thus be used to route some requests (e.g. /socket.io/) to Node while serving other requests (static files) from nginx.
Another option is to just use Node either to serve all requests, or to use a Node proxy module in conjuction with Socket.io, such as node-http-proxy.
I will show example setups next.
Sample deployments: scaling
Single machine, single stack
This is the simplest deployment. You have a single machine, and you don't want to run any other technologies, like Ruby/Python/PHP alongside your application.
[Socket.io server at :80]
The benefit is simplicity, but of course you are now tasking your Node server with a lot of work that it wouldn't need to do, such as serving static files and (optionally) SSL termination.
The first step in scaling this setup up is to use more CPU cores on the same machine. There are two ways to do this: use a load balancer, or use node cluster.
Single machine, dual stack, node proxies to second stack
In this case, we add another technology - like Ruby - to the stack. For example, the majority of the web app is written in Ruby, and real-time operations are handled by Node.
For simplicity, we will use Node to perform the routing.
[Socket.io server at :80] --> [Ruby at :3000 (not accessible directly)]
To implement the route, we will simply add a proxy from the Socket.io server to the Ruby server, e.g. using node-http-proxy or bouncy. Since this mostly app-specific coding, I'm not going to cover it here.
Alternatively, we can use HAProxy:
[HAProxy at :80] --> [Ruby at :3000] --> [Socket.io server at :8000]
Requests starting with /socket.io are routed to Socket.io listening on port 8000, and the rest to Ruby at port 3000.
To start HAProxy, run sudo haproxy -f path/to/haproxy.cfg
The associated HAProxy configuration file can be found here for your cloning and forking convinience.
I've also included a simple test server that listens on ports :3000 (http) and :8000 (websockets). It uses the same ws module that Socket.io uses internally, but with a much simpler setup.
Start the test server using node server.js
, then run the tests using node client.js
. If HAProxy works correctly, you will get a "It worked" message from the client:
$ node client.js Testing HTTP request Received HTTP response: PONG. EOM. Testing WS request Received WS message a Received WS message b Received WS message c Received WS message d Received WS message e Successfully sent and received 5 messages. Now waiting 5 seconds and sending a last WS message. It worked.
Single machine, SSL, dual stack, static assets served separately
Now, let's offload some services to different processes and start using SSL for Socket.io.
First, we will use nginx to serve static files for the application, and have nginx forward to the second technology, e.g. Ruby.
Second, we will start using SSL. Using SSL will increase the complexity of the setup somewhat, since you cannot route SSL-encrypted requests based on their request URI without decoding them first.
If we do not terminate SSL first, we cannot see what the URL path is (e.g. /socket.io/ or /rubyapp/ ). Hence we need to perform SSL termination before routing the request.
There are two options:
- Use Node to terminate SSL requests (e.g. start a HTTPS server).
- Use a separate SSL terminator, such as stunnel, stud or specialized hardware
Using Node is a neat solution, however, this will also increase the overhead per connection in Node (SSL termination takes memory and CPU time from Socket.io) and will require additional coding.
I would prefer not to have to maintain the code for handling the request routing in the Node app - and hence recommend using HAProxy.
Here we will use stunnel (alternative: stud) to offload this work. Nginx will proxy to Ruby only and Socket.io is only accessible behind SSL.
[Nginx at :80] --> [Ruby at :3000] [Stunnel at :443] --> [HAProxy at :4000] --> [Socket.io at :8000] --> [Nginx at :80] --> [Ruby at :3000]
Traffic comes in SSL-encrypted to port 443, where Stunnel removes the encryption, and then forwards the traffic to HAProxy.
HAProxy then looks at the destination and routes requests to /socket.io/ to Node at port 8000, and all other requests to Ruby/Nginx at port 3000.
To run Stunnel, use stunnel path/to/stunnel.conf
.
The associated HAProxy and Stunnel configuration files can be found here for your cloning and forking convinience.
To make connections to port 443 over SSL, run the connection tests for the testing tool using node client.js https
. If your HAProxy + Stunnel setup works correctly, you will get a "It worked" message from the client.
Multiple machines, SSL, dual stack, static assets
In this scenario, we are deploying to multiple machines running Socket.io servers; additionally, we have multiple machines running the second stack, e.g. Ruby.
For simplicity, I'm going to assume that a single machine is going to listen to incoming requests and handle load balancing and SSL decryption for the other servers.
Non-SSL traffic uses a slightly different HAProxy configuration, since non-SSL connections to Socket.io are assumed to be unwanted.
SSL traffic is first de-encrypted by Stunnel, then forwarded to HAproxy for load balancing.
[HAProxy at :80] --> Round robin to Ruby pool [Stunnel at :443] --> [HAProxy at :4000] --> Round robin to Ruby pool --> Source IP based stickiness to Socket.io pool
The configuration here is essentially the same as in the previous scenario (you can use the same config), but instead of having one backend server in each pool, we now have multiple servers and have to consider load balancing behavior.
Load balancing strategy and handshakes
HAproxy is configured with the same (URL-based) routing as in the previous example, but the traffic is balanced over several servers.
Note that in the configuration file, two different load balancing strategies are used. For the second (non-Socket.io) stack, we are using round robin load balancing. This assumes that any server in the pool can handle any request.
With Socket.io, there are two options for scaling up to multiple machines:
First, you can use source IP based sticky load balancing. Source IP based stickiness is needed because of the way Socket.io handles handshakes: unknown clients (e.g. clients that were handshaken on a different server) are rejected by the current (0.8.7) version of Socket.io.
This means that:
- in the event of a server failure, all client sessions must be re-established, since even if the load balancer is smart enough to direct the requests to a new Socket.io server, that server will reject those requests as not handshaken.
- load balancing must be sticky, because for example round robin would result in every connection attempt being rejected as "not handshaken" - since handshakes are mandatory but not synchronized across servers.
- doing a server deploy will require all clients to go through a new handshake, meaning that deploys are intrusive to the end users.
Example with four backend servers behind a load balancer doing round robin:
[client] -> /handshake -> [load balancer] -> [server #1] Your new Session id is 1 [client] -> /POST data (sess id =1) -> [load balancer] -> [server #2] Unknown session id, please reconnect [client] -> /handshake -> [load balancer] -> [server #3] Your new Session id is 2 [client] -> /POST data (sess id =2) -> [load balancer] -> [server #4] Unknown session id, please reconnect
This means that you have to use sticky load balancing with Socket.io.
The second alternative is to use the Stores mechanism in Socket.io. There is a Redis store which synchronizes in memory information across multiple servers via Redis.
Unfortunately, the stores in Socket.io are only a partial solution, since stores rely on a pub/sub API arrangement where all Socket.io servers in the pool receive all messages and maintain the state of all connected clients in-memory. This is not desirable in larger deployments, because the memory usage now grows across all servers independently of whether a client is connected to a particular server (related issue on GitHub).
Hopefully, in the future, Socket.io (or Engine.io) will offer the ability to write a different kind of system for synchronizing the state of clients accross multiple machines. In fact, this is being actively worked on in Engine.io - which will form the basis for the next release of Socket.io. Until then, you have to choose between these two approaches to scale over multiple machines.
相关推荐
计算机网络实时通讯技术是一种将通信技术与计算机网络相结合的技术,它能够在不同的计算机系统之间实时、高速、有效地传输和接收数据,包括图片、文字、声音和视频等多种媒体形式。这种技术的出现和应用极大地缩短了...
《WebRTC音视频实时互动技术》大纲覆盖了WebRTC的核心技术和实践应用,旨在帮助读者深入理解这一实时通信技术。以下是对大纲中重要知识点的详细解释: 1. 音视频服务质量: - 带宽管理:为了确保高质量的音视频...
基于以太网的硬实时通信技术ARTC 基于以太网的硬实时通信技术ARTC是一种新型的通信技术,它结合了以太网的高速、低成本、便捷安装和良好的兼容性,解决了传统实时网络的不足之处。ARTC技术采用命令/响应多路传输和...
【网络实时通信技术在部队中的应用价值】 网络实时通信技术是现代信息时代的关键技术,尤其在军事领域,其重要性不言而喻。随着科技的进步,网络通信技术已从最初的对讲机、电报等传统方式发展到如今的高安全、高...
传统的网页实时通信技术,如轮询和Comet,已经无法满足现代网页应用对于高效、低延迟通信的要求。HTML5的WebSocket技术应运而生,为解决这些问题提供了新的解决方案。 轮询是最基础的实时通信策略,客户端不断地向...
实时流程模拟技术研究中,OPC通讯技术扮演了至关重要的角色,因为它为模拟软件和现场数据之间提供了实时通讯的桥梁。 本研究的背景是将实时的DCS(分布式控制系统)数据传输到AspenPlus这一著名的化工模拟软件中,...
即时通讯技术(IM)是现代通信中不可或缺的一部分,尤其在移动互联网和社交网络日益普及的今天。本文将深入探讨IM技术在不同应用场景下的技术实现与性能调优,并以iOS视角为主进行分析。 ### IM即时通讯技术在不同...
### 实时搜索技术详解 #### 一、实时搜索技术概览 实时搜索技术是指能够即时响应用户查询,并提供最新信息的技术。它广泛应用于各种互联网场景中,如新闻更新、社交媒体互动、在线购物以及旅游预订等领域。去哪儿...
嵌入式实时网络通信技术.pdf 嵌入式实时网络通信技术是当今通信技术的核心之一。该技术的主要要求是...3. 推广应用:第三要推广应用,提高嵌入式实时网络通信技术的应用范围和深度,推广其在不同行业和领域的应用。
【大型仿真系统中的实时通信技术】是计算机工程领域的一个重要课题,主要关注在分布式网络环境中如何确保高效、稳定的数据传输。本文将详细阐述这一技术的关键点。 首先,实时通信技术在大型仿真系统中的核心目标是...
统一通信技术是指将不同的通信方式(如语音、视频、即时消息、电子邮件等)集成到一个单一的、便捷的通信平台中,以提高工作效率和简化通讯流程。然而,任何技术的实施都可能伴随着优点和缺点,下面将详细阐述统一...
在《银河仿真工作站实时通信技术的研究与实现》这篇论文中,国防科技大学的研究人员探讨了在银河仿真工作站上实现与实时网络通信的技术。文章主要关注的是如何在银河仿真工作站(GSW)与共享内存实时网络之间建立...
然而,无人机在巡检过程中产生的海量信息需要实时传输回控制中心,这就对无线通信技术提出了相当高的要求。本文将针对输电线路无人机巡检的实时通信技术进行深入研究,并探索解决问题的技术方案。 在现代通信技术中...
1. **长轮询(Long Polling)**:这是SignalR默认的传输方式之一,它是一种模拟双向通信的技术。服务器保持一个HTTP连接开放,直到有新的消息可发送给客户端,或者连接超时。这种方式在大多数现代浏览器中都表现良好...
Android 实时通信示例是指在 Android 设备上实现实时的通讯功能,例如聊天、视频通话等。这需要使用多种技术和知识点,包括 Java 的 socket 编程、多线程编程、线程通信等。 知识点 1: Java 的 Socket 编程 Socket...
【嵌入式实时网络通信技术】是近年来信息技术发展的重要领域,尤其在通信工程和技术开发中具有广泛应用。这种技术结合了嵌入式系统和实时网络通信的特点,为我国信息技术处理提供了高效、安全的解决方案。 嵌入式...
总的来说,基于VB6.0的上位机与PLC实时通信技术,提供了一种可行的解决方案,以实现工业现场中设备的实时监控和控制。通过深入分析通信方式、数据格式、通信协议以及数据传输的细节,该技术能够使得上位机与PLC之间...
嵌入式实时网络通信技术是现代信息技术中不可或缺的一部分,它涉及到通信工程、技术开发等多个领域,对于提高信息处理的实时性和可靠性具有重要意义。嵌入式操作系统作为这一技术的基础,其优缺点直接影响着通信效果...
该方案通过仿真实验验证了其有效性,实验结果显示,与传统的船舶通信系统相比,基于RFID技术的分布式实时通信系统具有更好的信道容量和抗干扰能力,能够显著提高船舶通信过程的实时性,保障通信系统的可靠性,并解决...
在工业自动化和控制领域,实时通信技术扮演着至关重要的角色。Ethernet POWERLINK作为一种实时工业以太网技术,能够在标准以太网基础上,实现高实时性数据传输。这类技术的引进和发展,是随着工业4.0和智能制造的...