
What are the differences between Apache Kafka and RabbitMQ?

 

Reposted from: https://www.quora.com/What-are-the-differences-between-Apache-Kafka-and-RabbitMQ

 

Kafka is a general purpose message broker, like RabbitMQ, with similar distributed deployment goals, but with very different assumptions on message model semantics. I would be skeptical of the "AMQP is more mature" argument and look at the facts of how either solution solves your problem.

TL;DR, 

a) Use Kafka if you have a fire hose of events (100k+/sec) you need delivered in partitioned order 'at least once' with a mix of online and batch consumers, you want to be able to re-read messages, you can deal with current limitations around node-level HA (or can use trunk code), and/or you don't mind supporting incubator-level software yourself via forums/IRC.  

b) Use Rabbit if you have messages (20k+/sec) that need to be routed in complex ways to consumers, you want per-message delivery guarantees, you don't care about ordered delivery, you need HA at the cluster-node level now, and/or you need 24x7 paid support in addition to forums/IRC.

Neither offers great "filter/query" capabilities - if you need that, consider using Storm on top of one of these solutions to add computation, filtering, querying, on your streams.   Or use something like Cassandra as your queryable cache.   Kafka is also definitely not "mature" even though it is "production ready". 

Details (caveat - my opinion, I've not used either in great anger, and I have more exposure to RabbitMQ)

Firstly, on RabbitMQ vs. Kafka.   They are both excellent solutions, RabbitMQ being more mature, but both have very different design philosophies.    Fundamentally, I'd say RabbitMQ is broker-centric, focused around delivery guarantees between producers and consumers, with transient preferred over durable messages.   Whereas Kafka is producer-centric, based around partitioning a fire hose of event data into durable message brokers with cursors, supporting batch consumers that may be offline, or online consumers that want messages at low latency.  

RabbitMQ uses the broker itself to maintain state of what's consumed (via message acknowledgements) - it uses Erlang's Mnesia to maintain delivery state around the broker cluster. Kafka doesn't have message acknowledgements; it assumes the consumer keeps track of what's been consumed so far. Both Kafka brokers and consumers use Zookeeper to reliably maintain their state across a cluster.
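To make the contrast concrete, here is a minimal in-memory sketch of the two bookkeeping styles. It is a toy model, not either product's API: the class names `AckQueue` and `PartitionLog` and all of their methods are hypothetical, chosen only to show where the "what has been consumed" state lives.

```python
from collections import deque


class AckQueue:
    """Broker-tracked state: the broker remembers unacknowledged deliveries."""

    def __init__(self):
        self._ready = deque()   # messages not yet handed to a consumer
        self._unacked = {}      # delivery tag -> message awaiting an ack
        self._next_tag = 1

    def publish(self, msg):
        self._ready.append(msg)

    def deliver(self):
        """Hand out the next message with a delivery tag, or None if empty."""
        if not self._ready:
            return None
        tag = self._next_tag
        self._next_tag += 1
        self._unacked[tag] = self._ready.popleft()
        return tag, self._unacked[tag]

    def ack(self, tag):
        # Once acknowledged, the broker forgets the message entirely.
        del self._unacked[tag]

    def pending(self):
        return len(self._ready) + len(self._unacked)


class PartitionLog:
    """Consumer-tracked state: the broker only appends; readers keep an offset."""

    def __init__(self):
        self._records = []

    def append(self, msg):
        self._records.append(msg)
        return len(self._records) - 1  # offset of the new record

    def read(self, offset, max_records=10):
        # Reading does not change broker state, so any offset can be re-read.
        return self._records[offset:offset + max_records]


queue = AckQueue()
log = PartitionLog()
for m in ("a", "b", "c"):
    queue.publish(m)
    log.append(m)

tag, body = queue.deliver()   # broker now tracks one unacked delivery
queue.ack(tag)                # and drops the message after the ack

offset = 0                    # the consumer owns this cursor
batch = log.read(offset)
offset += len(batch)
replay = log.read(0)          # re-reading from the start is just another read
```

In the queue model an acknowledged message is gone, while in the log model "consumed" is only a number the consumer holds, which is what makes re-reading cheap.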

RabbitMQ presumes that consumers are mostly online, and any messages "in wait" (persistent or not) are held opaquely (i.e. no cursor). RabbitMQ pre-2.0 (2010) would fall over if your consumers were too slow, but now it's robust for online and batch consumers - but clearly a large backlog of persistent messages sitting in the broker was not the main design case for AMQP in general. Kafka was designed from the beginning around both online and batch consumers, and also has producer message batching - it's designed for holding and distributing large volumes of messages.

RabbitMQ provides rich routing capabilities with AMQP 0.9.1's exchange, binding and queuing model.   Kafka has a very simple routing approach - in AMQP parlance it uses topic exchanges only.  
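For readers unfamiliar with the AMQP term, a topic exchange routes on a dot-separated routing key matched against binding patterns, where `*` stands for exactly one word and `#` for zero or more words. The sketch below shows that matching rule only; the function name `topic_matches` and the sample keys are made up for illustration, and it is not how either broker implements routing internally.

```python
def topic_matches(binding_key: str, routing_key: str) -> bool:
    """Return True if routing_key matches an AMQP 0.9.1 style topic pattern."""
    pattern = binding_key.split(".")
    words = routing_key.split(".")

    def match(p: int, w: int) -> bool:
        if p == len(pattern):
            # Pattern exhausted: succeed only if every word was consumed too.
            return w == len(words)
        if pattern[p] == "#":
            # '#' may absorb zero or more of the remaining words.
            return any(match(p + 1, k) for k in range(w, len(words) + 1))
        if w == len(words):
            return False
        if pattern[p] == "*" or pattern[p] == words[w]:
            # '*' or a literal word consumes exactly one word.
            return match(p + 1, w + 1)
        return False

    return match(0, 0)


# Hypothetical bindings: queue name -> binding pattern.
bindings = {
    "nyse_only": "stock.*.nyse",
    "all_stock": "stock.#",
}
routed = sorted(q for q, pat in bindings.items()
                if topic_matches(pat, "stock.ibm.nasdaq"))
```

Here `routed` contains only `all_stock`, since the `nyse_only` pattern requires the third word to be `nyse`.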

Both solutions run as distributed clusters, but RabbitMQ's philosophy is to make the cluster transparent, as if it were a virtual broker. Kafka makes it explicit, by forcing the producer to know it is partitioning a topic's messages across several nodes. This has the benefit of preserving ordered delivery within a partition, which is richer than what RabbitMQ exposes, which is almost always unordered delivery (the AMQP 0.9.1 model says "one producer channel, one exchange, one queue, one consumer channel" is required for in-order delivery).
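A rough sketch of why explicit partitioning preserves order: if the producer maps each key to one partition deterministically, all events for that key land in the same append-only sequence. The hash choice (`zlib.crc32`), the function name `partition_for`, and the sample events are assumptions for illustration and are not Kafka's actual partitioner.

```python
import zlib


def partition_for(key: str, num_partitions: int) -> int:
    """Deterministically map a key to a partition index (illustrative hash)."""
    return zlib.crc32(key.encode("utf-8")) % num_partitions


NUM_PARTITIONS = 4
partitions = [[] for _ in range(NUM_PARTITIONS)]

events = [
    ("user-1", "login"),
    ("user-2", "login"),
    ("user-1", "click"),
    ("user-1", "logout"),
]

# The producer, not the broker, decides where each event goes.
for key, value in events:
    partitions[partition_for(key, NUM_PARTITIONS)].append((key, value))

# All of user-1's events sit in one partition, in the order they were sent.
user1_partition = partitions[partition_for("user-1", NUM_PARTITIONS)]
user1_events = [value for key, value in user1_partition if key == "user-1"]
```

Order is guaranteed only within a partition; events for different keys that hash to different partitions have no defined order relative to each other.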

Put another way, Kafka presumes that producers generate a massive stream of events on their own timetable - there's no room for throttling producers because consumers are slow, since the data is too massive. The whole job of Kafka is to provide the "shock absorber" between the flood of events and those who want to consume them in their own way -- some online, others offline - only batch consuming on an hourly or even daily basis. Kafka can deliver "at least once" semantics per partition (since it maintains delivery order), just like RabbitMQ, but it does it in a very different way.
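One way to picture "at least once" with consumer-held cursors: the consumer saves its offset only after handling a message, so a crash between handling and saving causes that message to be handled again on restart, never skipped. The sketch below simulates this with a plain list as the partition; `consume` and its `crash_before_commit_at` parameter are hypothetical names, not part of any client library.

```python
def consume(log, committed_offset, handler, crash_before_commit_at=None):
    """Process records from committed_offset onward; return the saved offset.

    The offset is saved only after the handler runs, which yields
    at-least-once delivery: a crash may repeat a record but never skips one.
    """
    offset = committed_offset
    while offset < len(log):
        handler(log[offset])
        if offset == crash_before_commit_at:
            # Simulated crash: the record was handled but progress was not saved.
            return committed_offset
        offset += 1
        committed_offset = offset
    return committed_offset


log = ["e0", "e1", "e2"]
seen = []

# First run crashes right after handling e1, before saving its offset.
committed = consume(log, 0, seen.append, crash_before_commit_at=1)
after_crash = committed

# Restart from the last saved offset: e1 is handled a second time.
committed = consume(log, committed, seen.append)
```

After both runs `seen` holds `e1` twice, in order, which is why handlers in this style usually need to tolerate duplicates.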

Performance-wise, if you require ordered durable message delivery, currently it looks like there's no comparison:  Kafka currently blows away RabbitMQ in terms of performance on synthetic benchmarks.   This paper indicates 500,000 messages published per second and 22,000 messages consumed per second on a 2-node cluster with 6-disk RAID 10.  
http://research.microsoft.com/en...

Of course this was written by the LinkedIn guys without necessarily expert RabbitMQ input, so YMMV.

Finally, a reminder:  Kafka is an early Apache incubator project.  It doesn't necessarily have all the hard-learned aspects in RabbitMQ. 

Now, a word on AMQP. Frankly, it seems the standard is a mess. Officially there is a 1.0 proposed specification that is going through the OASIS standards process. In practice it is a forked standard, one (0.9.1) supported by vendors, the other (1.0) supported by the working group. A set of generally available, widely-adopted, production-quality AMQP 1.0 implementations across the major releases (Qpid from Red Hat, RabbitMQ, etc.) won't exist until 2013, if ever.

As an external observer with no inside knowledge, here is what it looks like:   the working group spent 5 years on a spec, from 2003 to 2008, culminating in a widely adopted release (0.9.1).   Then a subset of more powerful working group members rewrote the spec by late 2011, completely shifting the focus of the spec from a messaging model to a transport protocol (sort of like TCP++), and declared it 1.0.    So, we have the strange case where the "mature" AMQP is the non-standard 0.9.1 specification and the "immature" AMQP is the actual 1.0 standard.    

This isn't to suggest 1.0 isn't good technology, it likely is, but that it's a much lower-level spec than AMQP was intended to be for most of its published life, and is not widely supported yet beyond prototypes and one GA implementation that I know of (IIT SwiftMQ). The RabbitMQ folks have a prototype that layers the 0.9.1 model on top of 1.0 but have not committed to a GA timeframe.

So, in my opinion, AMQP has lost some of its sheen, as while there's ample evidence it is interoperable from the various connect-fests over the years, the standards politics have delayed the official standard and called into question its widespread support.   On the bright side, one can argue that AMQP has already succeeded in its goal of helping to break the hold TIBCO had on high performance, low latency messaging through 2007 or so.   Now there are many options.  Bet on the broker you choose to use, and don't expect bug-free interoperability for a few years (if ever).
