Comparison of Cassandra vs MongoDB vs CouchDB vs Redis vs Riak vs HBase

Original article:

http://kkovacs.eu/cassandra-vs-mongodb-vs-couchdb-vs-redis

Cassandra vs MongoDB vs CouchDB vs Redis vs Riak vs HBase comparison

While SQL databases are insanely useful tools, their tyranny of ~15 years is coming to an end. And it was about time: I can’t even count the things that were forced into relational databases but never really fit them.

But the differences between "NoSQL" databases are much bigger than they ever were between one SQL database and another. This puts a bigger responsibility on software architects to choose the appropriate one for a project right at the beginning.

In this light, here is a comparison of Cassandra, MongoDB, CouchDB, Redis, Riak and HBase:

 

CouchDB

  • Written in: Erlang
  • Main point: DB consistency, ease of use
  • License: Apache
  • Protocol: HTTP/REST
  • Bi-directional (!) replication,
  • continuous or ad-hoc,
  • with conflict detection,
  • thus, master-master replication. (!)
  • MVCC – write operations do not block reads
  • Previous versions of documents are available
  • Crash-only (reliable) design
  • Needs compacting from time to time
  • Views: embedded map/reduce
  • Formatting views: lists & shows
  • Server-side document validation possible
  • Authentication possible
  • Real-time updates via _changes (!)
  • Attachment handling
  • thus, CouchApps (standalone js apps)
  • jQuery library included

Best used: For accumulating, occasionally changing data, on which pre-defined queries are to be run. Places where versioning is important.

For example: CRM, CMS systems. Master-master replication is an especially interesting feature, allowing easy multi-site deployments.
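
To make the "pre-defined queries" point concrete, here is a minimal sketch of how a map/reduce view works in principle. It is a pure-Python simulation, not CouchDB's own API (CouchDB views are normally written as JavaScript functions and served over HTTP). The sample documents, field names, and the helpers `map_contacts_by_city`, `reduce_sum` and `build_view` are all made up for illustration.

```python
from collections import defaultdict

# Hypothetical sample documents; the field names are assumptions.
DOCS = [
    {"_id": "a", "type": "contact", "city": "Berlin"},
    {"_id": "b", "type": "contact", "city": "Paris"},
    {"_id": "c", "type": "contact", "city": "Berlin"},
    {"_id": "d", "type": "invoice", "total": 120},
]


def map_contacts_by_city(doc):
    """Map step: emit (key, value) pairs for the documents we care about."""
    if doc.get("type") == "contact":
        yield doc["city"], 1


def reduce_sum(values):
    """Reduce step: collapse all values emitted for one key."""
    return sum(values)


def build_view(docs, map_fn, reduce_fn):
    """Run the map step over every document, group by key, then reduce."""
    grouped = defaultdict(list)
    for doc in docs:
        for key, value in map_fn(doc):
            grouped[key].append(value)
    # Sort by key so the result is ordered, as a view index would be.
    return {key: reduce_fn(values) for key, values in sorted(grouped.items())}


view = build_view(DOCS, map_contacts_by_city, reduce_sum)
print(view)
```

The view is defined ahead of time and only needs updating when documents change, which is why this model suits data that accumulates and is queried in known ways.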

Redis

  • Written in: C/C++
  • Main point: Blazing fast
  • License: BSD
  • Protocol: Telnet-like
  • Disk-backed in-memory database,
  • but since 2.0, it can swap to disk.
  • Master-slave replication
  • Simple keys and values,
  • but complex operations like ZREVRANGEBYSCORE
  • INCR & co (good for rate limiting or statistics)
  • Has sets (also union/diff/inter)
  • Has lists (also a queue; blocking pop)
  • Has hashes (objects of multiple fields)
  • Of all these databases, only Redis does transactions (!)
  • Values can be set to expire (as in a cache)
  • Sorted sets (high score table, good for range queries)
  • Pub/Sub and WATCH on data changes (!)

Best used: For rapidly changing data with a foreseeable database size (should fit mostly in memory).

For example: Stock prices. Analytics. Real-time data collection. Real-time communication.
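
The "INCR & co" and expiring-values bullets combine into a common rate-limiting pattern: increment a per-client counter and let it expire at the end of a time window. The sketch below only simulates that idea in memory; `TinyCounterStore` and `allow_request` are hypothetical names, not a Redis client, and the clock is passed in explicitly so the behavior is deterministic.

```python
class TinyCounterStore:
    """In-memory stand-in for counter-with-expiry behavior (not a Redis client)."""

    def __init__(self):
        self._data = {}  # key -> (value, expires_at or None)

    def incr(self, key, now):
        """Increment the counter at key, treating an expired key as absent."""
        value, expires_at = self._data.get(key, (0, None))
        if expires_at is not None and now >= expires_at:
            value, expires_at = 0, None
        value += 1
        self._data[key] = (value, expires_at)
        return value

    def expire(self, key, ttl, now):
        """Mark an existing key to expire ttl seconds after now."""
        value, _ = self._data[key]
        self._data[key] = (value, now + ttl)


def allow_request(store, client_id, now, limit=3, window=60):
    """Fixed-window limiter: at most `limit` requests per `window` seconds."""
    key = f"rate:{client_id}:{int(now // window)}"
    count = store.incr(key, now)
    if count == 1:
        # First hit in this window: make the counter clean itself up.
        store.expire(key, window, now)
    return count <= limit


store = TinyCounterStore()
results = [allow_request(store, "alice", t) for t in (0, 1, 2, 3)]
print(results)
```

With a limit of 3 per 60-second window, the fourth request inside the same window is rejected, and a request in the next window is allowed again.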

MongoDB

  • Written in: C++
  • Main point: Retains some friendly properties of SQL. (Query, index)
  • License: AGPL (Drivers: Apache)
  • Protocol: Custom, binary (BSON)
  • Master/slave replication
  • Queries are javascript expressions
  • Run arbitrary javascript functions server-side
  • Better update-in-place than CouchDB
  • Sharding built-in
  • Uses memory mapped files for data storage
  • Performance over features
  • After crash, it needs to repair tables

Best used: If you need dynamic queries. If you prefer to define indexes, not map/reduce functions. If you need good performance on a big DB. If you wanted CouchDB, but your data changes too much, filling up disks.

For example: For all things that you would do with MySQL or PostgreSQL, but having predefined columns really holds you back.
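
As a rough illustration of what "dynamic queries" over documents without predefined columns means, here is a small pure-Python matcher. It is not MongoDB's implementation or driver API. It only mimics the shape of a query document with a few comparison operators (`$gt`, `$lt`, `$in`), and the `matches` and `find` helpers and the sample data are assumptions made for this sketch.

```python
# Supported operators in this sketch; a missing field never satisfies $gt/$lt.
OPERATORS = {
    "$gt": lambda actual, expected: actual is not None and actual > expected,
    "$lt": lambda actual, expected: actual is not None and actual < expected,
    "$in": lambda actual, expected: actual in expected,
}


def matches(doc, query):
    """Return True if every field condition in the query holds for the doc."""
    for field, condition in query.items():
        actual = doc.get(field)
        if isinstance(condition, dict):
            # Operator form, e.g. {"age": {"$gt": 30}}
            if not all(OPERATORS[op](actual, arg) for op, arg in condition.items()):
                return False
        elif actual != condition:
            # Plain equality form, e.g. {"city": "Oslo"}
            return False
    return True


def find(docs, query):
    """Ad-hoc query: no view or schema has to be defined in advance."""
    return [doc for doc in docs if matches(doc, query)]


# Documents in one collection need not share the same fields.
PEOPLE = [
    {"name": "Ann", "age": 34, "city": "Oslo"},
    {"name": "Bob", "age": 27},
    {"name": "Cid", "city": "Oslo", "tags": ["admin"]},
]

result = [d["name"] for d in find(PEOPLE, {"city": "Oslo", "age": {"$gt": 30}})]
print(result)
```

A real database would use indexes rather than scanning every document, but the query is still composed at run time, in contrast with a view defined in advance.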

Cassandra

  • Written in: Java
  • Main point: Best of BigTable and Dynamo
  • License: Apache
  • Protocol: Custom, binary (Thrift)
  • Tunable trade-offs for distribution and replication (N, R, W)
  • Querying by column, range of keys
  • BigTable-like features: columns, column families
  • Writes are much faster than reads (!)
  • Map/reduce possible with Apache Hadoop
  • I admit to being a bit biased against it because of its bloat and complexity, partly due to Java (configuration, seeing exceptions, etc.)

Best used: When you write more than you read (logging). If every component of the system must be in Java. ("No one gets fired for choosing Apache’s stuff.")

For example: Banking, financial industry (though not necessarily for financial transactions, but these industries are much bigger than that.) Writes are faster than reads, so one natural niche is real time data analysis.
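
The "tunable trade-offs (N, R, W)" bullet refers to the Dynamo-style rule of thumb: with N replicas, a write acknowledged by W of them and a read that consults R of them are guaranteed to overlap in at least one replica when R + W > N. The brute-force check below is only a sketch of that arithmetic, not Cassandra code; the function name is made up.

```python
from itertools import combinations


def read_always_sees_latest_write(n, r, w):
    """Check every possible write set and read set for a shared replica.

    n: number of replicas, w: replicas that acknowledged the write,
    r: replicas consulted by the read.
    """
    replicas = range(n)
    for write_set in combinations(replicas, w):
        for read_set in combinations(replicas, r):
            if not set(write_set) & set(read_set):
                # A read could hit only replicas that missed the write.
                return False
    return True


for n, r, w in [(3, 2, 2), (3, 1, 1), (3, 1, 3)]:
    print(n, r, w, read_always_sees_latest_write(n, r, w))
```

Lower R and W give faster operations that may return stale data, while higher values trade latency for consistency.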

Riak

  • Written in: Erlang & C, some Javascript
  • Main point: Fault tolerance
  • License: Apache
  • Protocol: HTTP/REST
  • Tunable trade-offs for distribution and replication (N, R, W)
  • Pre- and post-commit hooks,
  • for validation and security.
  • Built-in full-text search
  • Map/reduce in javascript or Erlang
  • Comes in "open source" and "enterprise" editions

Best used: If you want something Cassandra-like (Dynamo-like), but no way you’re gonna deal with the bloat and complexity. If you need very good single-site scalability, availability and fault-tolerance, but you’re ready to pay for multi-site replication.

For example: Point-of-sales data collection. Factory control systems. Places where even seconds of downtime hurt.

HBase

(With the help of ghshephard)

  • Written in: Java
  • Main point: Billions of rows X millions of columns
  • License: Apache
  • Protocol: HTTP/REST (also Thrift)
  • Modeled after BigTable
  • Map/reduce with Hadoop
  • Query predicate push down via server side scan and get filters
  • Optimizations for real time queries
  • A high performance Thrift gateway
  • HTTP supports XML, Protobuf, and binary
  • Cascading, hive, and pig source and sink modules
  • Jruby-based (JIRB) shell
  • No single point of failure
  • Rolling restart for configuration changes and minor upgrades
  • Random access performance is like MySQL

Best used: If you’re in love with BigTable. :) And when you need random, realtime read/write access to your Big Data.

For example: Facebook Messaging Database (more general example coming soon)

Of course, all systems have many more features than what’s listed here. I only wanted to list the key points that I base my decisions on. Also, development of all of them is very fast, so things are bound to change. I’ll do my best to keep this list updated.

– Kristof

 

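CouchDB
Language: Erlang
Main point: DB consistency, ease of use
Best for: accumulating, rarely changing data, or cases that need multiple versions of a document (translator's note: not sure this renders "Places where versioning is important" well)
Use cases: CRM, CMS

Redis
Language: C/C++
Main point: very fast
Best for: rapidly changing data sets whose total size is predictable
Use cases: stock prices, real-time analytics, real-time data collection, real-time communication

MongoDB
Language: C++
Main point: SQL-like style (queries, indexes)
Best for: dynamic queries; cases where indexes fit better than map/reduce; cases where you want CouchDB but the data changes more often
Use cases: anything you would do with MySQL/PostgreSQL when you cannot tolerate predefining every column

Cassandra
Language: Java
Best for: more writes than reads
Rest omitted (I personally don't find it reliable either)

Riak
Language: Erlang & C, some JavaScript
Reportedly comes in open-source and enterprise editions
Main point: fault tolerance
Best for: when you need Cassandra-style scalability but don't want the trouble
Use cases: sales data collection … anywhere even one second of downtime is a serious problem

HBase
Language: Java
Similar to: BigTable
Best for: random reads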
