Though Shareaholic is hosted in the AWS cloud, we avoid depending on Amazon’s virtualized cloud services whenever possible. If we ever hit a bottleneck in AWS, I want to be able to switch providers without needing to rebuild a core piece of our architecture. I also don’t want our tech team to have to make product and infrastructure sacrifices just so that we conform to AWS standard practices.
Load balancing with HAProxy was the first example of a service that Amazon provides, that we felt was better to manage ourselves.
Here’s how we did it.
Why HAProxy?
I’ve used HAProxy nonstop since 2009, and to date it has never been the culprit of a single problem I’ve experienced. It is extremely resource efficient and it is impeccably maintained by Willy Tarreau.
It provides both HTTP and TCP load balancing, and a powerful feature set not found in Amazon’s ELB.
Monitoring
HAProxy provides a dashboard with global stats, server stats, and aggregate stats for each frontend and backend. On a single page you can see which servers are busiest, which ones are up and which ones are down. You can also view this data at a service level, so if you have a three node cluster and only two of the servers were down at any point in time, HAProxy correctly summarizes that you did not experience any downtime.
While writing this post, one of our rails boxes had degraded performance due to too many background jobs running. Using the HAProxy dashboard, I quickly spotted it. You probably can too:
(Mouseover the image for the spoiler).
Configurability
HAProxy is highly configurable. You have over a half dozen load balancing algorithms to choose from. You can provide relative weighting for different servers in case you have boxes of differing resources. You can specify small health check timeouts on servers that should resond quickly and longer ones for servers that are slower. You can set these preferences globally, on a per-backend basis or on a per-server basis.
Requests as Mutable Objects
We often think of a web requests as being static, but with HAProxy we can bend them to our will to do important things.
Let’s start with the prototypical load balancer example: adding an X-Forwarded-Proto
header so that our app server knows it’s receiving an SSL request even though it’s coming through on port 80:
reqadd X-Forwarded-Proto:\ https
Now let’s try something we can’t do with ELB: perform HTTP Basic Authentication before a request hits our app server.
userlist admins
user myusername insecure-password mypassword
frontend restricted_cluster
acl auth_tintoretto http_auth(admins)
http-request auth realm ShareaholicRestricted
Any web request being routed to restricted_cluster will need to contain valid HTTP auth credentials in order for that request to be sent to an application server. At Shareaholic we use this for our internal products and dashboards. This lets us manage authentication in a single place, and prevents us from needing to worry about it at the application level.
Frontends
HAProxy supports an infinite number of independently configurable load balancing frontends. In addition to HTTP load balancing, it can be used in TCP mode for general purpose load balancing. This means that if you have a redis database with three read slaves, you can use the same HAProxy instance that balances your web traffic to balance your redis traffic.
Backends and ACLs
Backends and ACLs enable you to host large-scale web sites and web services in ways not possible with simple request forwarding.
For example, did you know that the shareaholic.com domain is hosted partly in a PHP cluster, partly in a Rails cluster, partly in a Python cluster and partly as static content on github.com? We accomplish this with backends and ACLs.
First, we define an ACL based on the path of the URL request. An abbreviated example:
# Tech blog traffic goes to tech blog
acl techblog path_beg /tech
use_backend sharingthetech if techblog
# Other traffic goes to rails app
default_backend rails_app
Then we define backends composed of one or more servers to handle the different kinds of traffic:
backend rails_app :80
option httpchk /haproxy_health_check
server rails-1 10.1.1.1:8080
server rails-2 10.1.1.2:8080
server rails-3 10.1.1.3:8080
backend sharingthetech :80
reqirep ^Host: Host:\ shareaholic.com
server github-pages shareaholic.github.com:80
There are two things to note in this example:
First, our rails servers are configured to receive their health checks at /haproxy_health_check
. These requests are handled by a proc in our routes.rb. We chose this rather than a static file on disk because we want the rails stack to be invoked during the health check; if apache is up but rails isn’t working, we want the health check to fail.
Second, Github pages only works for a single host (CNAME), so in our sharingthetech
backend, we rewrite the Host
header to ensure it is shareaholic.com
, which is what’s contained in our Github Pages CNAME file.
Note: Since writing this post we have moved our tech blog from http://shareaholic.com/tech to http://tech.shareaholic.com. This example illustrates how we had hosted it at http://shareaholic.com/tech.
Shareaholic’s HAProxy Production Setup
Deployment
We run HAProxy on EC2 small boxes backed by instance storage. We do not back up these instances because Chef generates the config files on the fly. We use a custom fork of the haproxy cookbook that provides support for compiling HAProxy from source. We use the latest 1.5-dev12 build.
The shareaholic.com A record points to an Elastic IP address, which we assign to our load balancer. A second load balancer is always running; if the primary load balancer goes down, we simply reassign the Elastic IP address to our backup load balancer.
Because HAProxy consumes negligible CPU cycles and memory when not in use, we save money by avoiding a single tenant backup balancer. Instead we run our backup balancer on a utility box that is doing other work, and that utility box doubles as a hot standby that can handle traffic in a pinch.
HAProxy Configuration
Here is our production haproxy.cfg. The only changes are 1) redactions to protect authentication credentials and 2) Changes to IP addresses and paths to protect potential attack vectors:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 |
|
SSL Termination
Terminating SSL at the Load Balancer: Why?
Many web sites defer SSL termination to the app servers that handle the individual requests. Because of the way we route traffic at Shareaholic, this is not possible for us. In order to choose a backend based on the request’s path or its Host
header, we need to be able to inspect the request. This is only possible once the request has been decrypted.
An additional benefit of terminating SSL at the load balancer is the operational DRY it provides. Rather than configure SSL termination for JBoss and Apache and nginx and any other app server we run, we can configure it in one place. When it’s time to update our SSL keys, we just drop them into that one place and redeploy the box.
How We Terminate SSL
As of build 1.5-dev12 (September 10, 2012), HAProxy can be configured to perform SSL termination. We however use stud for this task because HAProxy did not have this feature when we began using it. We considered the more popular stunnel but found that stud provided substantially better performance. This is very important to us because SSL termination is the most expensive part of load balancing, and we want to squeeze as many connections as possible out of our EC2 boxes.
I’ve open sourced the stud Chef cookbook that I wrote for deploying stud to our production servers. Even if you don’t use Chef, that cookbook is instructive in how to get stud up and running quickly.
Performance and Scalability
While writing this post, HAProxy reports 161 concurrent connections with a new connection rate of 67/second. It is using about 4.1% of the single EC2 compute unit allocated to our EC2 small box.
stud is consuming an additional 9.7% of the CPU. We force SSL on all pages, so that is with near-100% of traffic flowing over SSL. (I say “near” to account for initial HTTP requests being redirected to their HTTPS equivalents). Based on these numbers, a single EC2 compute unit can handle 1167 concurrent connections, with an arrival rate of 486 connections/second. (RAM is a non-factor; combined, HAProxy and stud are consuming only 73.6 MB).
We can scale vertically on AWS until we need more than 88 EC2 compute units. That means we won’t need to tackle horizontal scalability until we are handling over 100,000 concurrent connections or accepting over 40,000 new connections per second.
If you’re fortunate enough to have that problem, the simplest way to scale horizontally is with Round-Robin DNS. This enables you to point your A record to multiple load balancers straight from your DNS.
Go Give It a Try!
Among our Chef cookbooks and the configuration posted above, we’ve open sourced everything you need to build and deploy your own load balancer, complete with SSL termination and examples for path-based and domain-based routing.
If you’ve ever wanted to host your blog on your application domain, or if you’re trying to avoid yet another provider-specific dependency in your tech stack, I encourage you to give this a try and see how easy it is to do load balancing yourself.
[Source URL: https://tech.shareaholic.com/2012/10/26/haproxy-a-substitute-for-amazon-elb/]
相关推荐
HAProxy is a free and open-source load balancer that enables IT professionals to distribute TCP-based traffic across many backend servers. In this book, the reader will learn how to configure and ...
AWS Elastic Load Balancer 的 DNS 通配符我遇到了一个用例,我快速创建/销毁 ELB 以进行测试/演示,我需要这些 ELB 来支持 DNS 通配符。 也就是说,我希望*.NAME-ID.REGION.elb.amazonaws.com都登陆同一个 ELB,...
2. **Laravel中的负载均衡实现**:Laravel本身并不直接提供负载均衡功能,但可以通过集成第三方工具或使用Nginx、HAProxy、AWS Elastic Load Balancer等负载均衡器来实现。这些工具可以在应用层或者网络层进行负载...
Ansible-ansible-nginx-haproxy-elasticsearch.zip,用于部署的移动配置:nginx、haproxy、elasticsearchmycluster ansible playbook,ansible是一个简单而强大的自动化引擎。它用于帮助配置管理、应用程序部署和任务...
LoadBalancer-Formula 设置和配置基本负载均衡器的公式。 在 CentOS 6.5 上测试。 该公式使用 haproxy 和 keepalived 来创建弹性负载平衡解决方案。 Haproxy 可以对各种前端和后端服务之间的连接进行负载平衡。 ...
在这个主题中,我们将聚焦于"FinalBSD:通过HAProxy,构建开源负载均衡架构平台"这一课件,探讨如何利用开源工具HAProxy在FreeBSD操作系统上实现负载均衡。 HAProxy是一款高性能、轻量级的TCP/HTTP负载均衡器,广泛...
Ansible-Playbook配置反向代理Haproxy负载平衡器和httpd服务器在AWS实例上 每当新的受管节点(使用Apache Webserver配置)加入清单时,使用Ansible剧本来配置反向代理(即Haproxy)并自动更新其配置文件。 使用那里...
乙酰氧基此存储库包含一个设置HAproxy Load Balancer的操作,包括库存和配置文件如何使用Ansible设置HAproxy(负载均衡器)。 读者好,在今天的博客中,我将向您展示如何设置负载均衡器(HAproxy)来平衡Web服务器上...
HAProxy is a free, very fast and reliable solution offering high availability, load balancing, and proxying for TCP and ... Over the years it has become the de-facto standard opensource load balancer, is...
haproxy_exporter_build_info haproxy_exporter_csv_parse_failures haproxy_exporter_total_scrapes haproxy_frontend_bytes_in_total haproxy_frontend_bytes_out_total haproxy_frontend_compressor_bytes_...
Created symlink from /etc/systemd/system/multi-user.target.wants/haproxy.service to /usr/lib/systemd/system/haproxy.service. [root@web_test ~]# netstat -lntup | grep haproxy tcp 0 0 0.0.0.0:8080 0.0...
Ansible-Playbook配置反向代理Haproxy-Loadbalancer和httpd服务器每当新的受管节点(使用Apache Webserver配置)加入清单时,使用Ansible剧本来配置反向代理(即Haproxy)并自动更新其配置文件。
Elastic HAProxy 是 HAProxy 的现代包装器。 主要兼容 ELB HTTP Api 向 Statsd 报告关键的 HAproxy 指标 动态更新前端和后端(零停机重新加载) 多节点协调,横向扩展到 N 个负载均衡器 Etcd、Route53 和 Docker...
If you're concerned with running your load balancer for an entire network, you'll find out how to set up your network topography, and condense each topographical variety into recipes that will serve ...
Load Balancing with HAProxy原版英文书籍,Open-source technology for better scalability, redundancy and availability in your IT infrastructure
用haproxy实现RDP会话负载均衡 HAProxy是一款免费、快速、可靠的解决方案,提供高可用性、负载均衡和基于TCP和HTTP应用的代理。它支持虚拟主机,运行在当前的硬件上,可以支持数以万计的并发连接。HAProxy特别适用...
5. **启动和管理haproxy**:编译完成后,可以在Cygwin终端使用`/cygdrive/c/haproxy/sbin/haproxy -f /path/to/haproxy.cfg -D`命令启动haproxy服务。`-D`参数让haproxy在后台运行。可以通过`haproxy -f /path/to/...
HAProxy安装与部署 HAProxy 是一种高性能的反向代理服务器软件,提供高可用性、负载均衡和基于 TCP 和 HTTP 应用的代理,支持虚拟主机。HAProxy 是一种免费、快速并且可靠的一种解决方案,特别适用于那些负载特大的...
**haproxy1.7 最新版本** haproxy(High Availability Proxy)是一款开源的、高性能的负载均衡器和反向代理服务器,广泛应用于互联网服务的高可用性和负载分发。haproxy1.7是该软件的一个较新版本,发布于2016年,...
HAProxy提供高可用性、负载均衡以及基于TCP和HTTP应用的代理,支持虚拟主机, 它是免费、快速并且可靠的一种解决方案。... - BUILD: make it possible to look for pcre in the default system paths