Cassandra vs MongoDB vs CouchDB vs Redis vs Riak vs HBase: a comparison
Original article:
http://kkovacs.eu/cassandra-vs-mongodb-vs-couchdb-vs-redis
Cassandra vs MongoDB vs CouchDB vs Redis vs Riak vs HBase comparison
While SQL databases are insanely useful tools, their tyranny of ~15 years is coming to an end. And it was about time: I can’t even count the things that were forced into relational databases but never really fit them.
But the differences between "NoSQL" databases are much bigger than they ever were between one SQL database and another. This means that software architects bear a bigger responsibility to choose the appropriate one for a project right at the beginning.
In this light, here is a comparison of Cassandra, MongoDB, CouchDB, Redis, Riak and HBase:
CouchDB
- Written in: Erlang
- Main point: DB consistency, ease of use
- License: Apache
- Protocol: HTTP/REST
- Bi-directional (!) replication, continuous or ad-hoc, with conflict detection; thus, master-master replication (!)
- MVCC – write operations do not block reads
- Previous versions of documents are available
- Crash-only (reliable) design
- Needs compacting from time to time
- Views: embedded map/reduce
- Formatting views: lists & shows
- Server-side document validation possible
- Authentication possible
- Real-time updates via _changes (!)
- Attachment handling; thus, CouchApps (standalone JS apps)
- jQuery library included
Best used: For accumulating, occasionally changing data, on which pre-defined queries are to be run. Places where versioning is important.
For example: CRM, CMS systems. Master-master replication is an especially interesting feature, allowing easy multi-site deployments.
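The MVCC and conflict-detection points above come down to one rule: an update must name the revision it was based on, and an update based on a stale revision is rejected. The following is a minimal in-memory sketch of that rule. `ToyDocStore`, `RevisionConflict`, and the exact revision-string format are hypothetical names chosen for illustration; this is not CouchDB's implementation or API.

```python
import hashlib
import json


class RevisionConflict(Exception):
    """Raised when an update names a revision that is no longer current."""


class ToyDocStore:
    """Hypothetical in-memory store that checks revisions on every write."""

    def __init__(self):
        self._docs = {}  # doc_id -> (revision, body)

    def get(self, doc_id):
        """Return the (revision, body) pair for a stored document."""
        return self._docs[doc_id]

    def put(self, doc_id, body, rev=None):
        """Store body if rev matches the current revision; return the new revision."""
        current = self._docs.get(doc_id)
        current_rev = current[0] if current else None
        if rev != current_rev:
            raise RevisionConflict(f"{doc_id}: expected {current_rev}, got {rev}")
        # Assumed revision format: "<generation>-<hash of the body>".
        generation = int(current_rev.split("-")[0]) + 1 if current_rev else 1
        payload = json.dumps(body, sort_keys=True).encode("utf-8")
        new_rev = f"{generation}-{hashlib.sha256(payload).hexdigest()[:12]}"
        self._docs[doc_id] = (new_rev, body)
        return new_rev
```

A writer that read revision 1 and tries to save after someone else has already produced revision 2 gets a `RevisionConflict` instead of silently overwriting the newer data, which is what makes multi-master setups workable.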
Redis
- Written in: C
- Main point: Blazing fast
- License: BSD
- Protocol: Telnet-like
- Disk-backed in-memory database, but since 2.0 it can swap to disk
- Master-slave replication
- Simple keys and values, but complex operations like ZREVRANGEBYSCORE
- INCR & co (good for rate limiting or statistics)
- Has sets (also union/diff/inter)
- Has lists (also a queue; blocking pop)
- Has hashes (objects of multiple fields)
- Of all these databases, only Redis does transactions (!)
- Values can be set to expire (as in a cache)
- Sorted sets (high score table, good for range queries)
- Pub/Sub and WATCH on data changes (!)
Best used: For rapidly changing data with a foreseeable database size (should fit mostly in memory).
For example: Stock prices. Analytics. Real-time data collection. Real-time communication.
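To make the "Telnet-like" protocol concrete, here is a small sketch of how a command such as INCR can be framed on the wire. It assumes the request format described in the Redis protocol documentation (an array of length-prefixed bulk strings, each line ending in CRLF). `encode_command` is a hypothetical helper, not part of any Redis client library.

```python
def encode_command(*args):
    """Encode a command and its arguments as an array of bulk strings."""
    # Header: "*<number of arguments>" followed by CRLF.
    parts = [b"*" + str(len(args)).encode("ascii") + b"\r\n"]
    for arg in args:
        data = arg if isinstance(arg, bytes) else str(arg).encode("utf-8")
        # Each argument: "$<byte length>" CRLF, then the raw bytes, then CRLF.
        parts.append(b"$" + str(len(data)).encode("ascii") + b"\r\n" + data + b"\r\n")
    return b"".join(parts)


# A counter increment, as used for rate limiting or statistics.
incr_request = encode_command("INCR", "hits")

# A range query over a sorted set ("leaderboard" is an illustrative key name).
top_scores_request = encode_command("ZREVRANGEBYSCORE", "leaderboard", "+inf", "-inf")
```

Because every argument is length-prefixed, values may contain arbitrary bytes, yet the framing stays simple enough to read in a terminal session.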
MongoDB
- Written in: C++
- Main point: Retains some friendly properties of SQL. (Query, index)
- License: AGPL (Drivers: Apache)
- Protocol: Custom, binary (BSON)
- Master/slave replication
- Queries are JavaScript expressions
- Run arbitrary JavaScript functions server-side
- Better update-in-place than CouchDB
- Sharding built-in
- Uses memory mapped files for data storage
- Performance over features
- After crash, it needs to repair tables
Best used: If you need dynamic queries. If you prefer to define indexes, not map/reduce functions. If you need good performance on a big DB. If you wanted CouchDB, but your data changes too much, filling up disks.
For example: For all things that you would do with MySQL or PostgreSQL, but having predefined columns really holds you back.
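"Dynamic queries" here means the query is itself a piece of data built at run time, rather than a view defined in advance. The toy evaluator below illustrates the idea on plain Python dictionaries. `matches` is a hypothetical function supporting only equality and a few comparison operators; it is an assumption-laden sketch, and real server semantics are much richer.

```python
def matches(doc, query):
    """Return True if doc satisfies a tiny subset of MongoDB-style query syntax."""
    operators = {
        "$gt": lambda value, operand: value is not None and value > operand,
        "$gte": lambda value, operand: value is not None and value >= operand,
        "$lt": lambda value, operand: value is not None and value < operand,
        "$lte": lambda value, operand: value is not None and value <= operand,
        "$ne": lambda value, operand: value != operand,
    }
    for field, condition in query.items():
        value = doc.get(field)
        if isinstance(condition, dict):
            # Assumption: a dict condition contains only operators listed above.
            for op, operand in condition.items():
                if not operators[op](value, operand):
                    return False
        elif value != condition:
            return False
    return True


people = [
    {"name": "Ada", "city": "Beijing", "age": 36},
    {"name": "Bob", "city": "Beijing", "age": 25},
    {"name": "Eve", "city": "Budapest"},  # no fixed schema: "age" is absent
]
query = {"city": "Beijing", "age": {"$gt": 30}}
selected = [person["name"] for person in people if matches(person, query)]
```

Note that the third document simply lacks an `age` field, which is the "no predefined columns" property the text refers to.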
Cassandra
- Written in: Java
- Main point: Best of BigTable and Dynamo
- License: Apache
- Protocol: Custom, binary (Thrift)
- Tunable trade-offs for distribution and replication (N, R, W)
- Querying by column, range of keys
- BigTable-like features: columns, column families
- Writes are much faster than reads (!)
- Map/reduce possible with Apache Hadoop
- I admit to being a bit biased against it because of its bloat and complexity, partly due to Java (configuration, seeing exceptions, etc.)
Best used: When you write more than you read (logging). If every component of the system must be in Java. ("No one gets fired for choosing Apache’s stuff.")
For example: Banking, financial industry (though not necessarily for financial transactions, but these industries are much bigger than that.) Writes are faster than reads, so one natural niche is real time data analysis.
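The (N, R, W) trade-off mentioned above can be stated as simple arithmetic: with N replicas, a write acknowledged by W of them and a read that consults R of them must share at least one replica whenever R + W > N. The sketch below encodes that general Dynamo-style rule; the function names are hypothetical, and it deliberately ignores everything else a real cluster does (repair, hinted handoff, timestamps).

```python
def quorum(n):
    """Smallest replica count that forms a strict majority of n replicas."""
    return n // 2 + 1


def read_overlaps_write(n, r, w):
    """True if every set of r replicas intersects every set of w replicas."""
    if not (1 <= r <= n and 1 <= w <= n):
        raise ValueError("r and w must each be between 1 and n")
    return r + w > n


# Three replicas, majority reads and writes: every read overlaps the last write.
strong = read_overlaps_write(3, quorum(3), quorum(3))

# Three replicas, single-replica reads and writes: fast, but a read may miss a write.
weak = read_overlaps_write(3, 1, 1)
```

Lowering W makes writes cheaper at the cost of requiring a larger R for the same overlap guarantee, which is one way to favor write-heavy workloads such as logging.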
Riak
- Written in: Erlang & C, some JavaScript
- Main point: Fault tolerance
- License: Apache
- Protocol: HTTP/REST
- Tunable trade-offs for distribution and replication (N, R, W)
- Pre- and post-commit hooks, for validation and security
- Built-in full-text search
- Map/reduce in JavaScript or Erlang
- Comes in "open source" and "enterprise" editions
Best used: If you want something Cassandra-like (Dynamo-like), but no way you’re gonna deal with the bloat and complexity. If you need very good single-site scalability, availability and fault-tolerance, but you’re ready to pay for multi-site replication.
For example: Point-of-sales data collection. Factory control systems. Places where even seconds of downtime hurt.
HBase
(With the help of ghshephard)
- Written in: Java
- Main point: Billions of rows X millions of columns
- License: Apache
- Protocol: HTTP/REST (also Thrift)
- Modeled after BigTable
- Map/reduce with Hadoop
- Query predicate push down via server side scan and get filters
- Optimizations for real time queries
- A high performance Thrift gateway
- HTTP supports XML, Protobuf, and binary
- Cascading, Hive, and Pig source and sink modules
- JRuby-based (JIRB) shell
- No single point of failure
- Rolling restart for configuration changes and minor upgrades
- Random access performance is like MySQL
Best used: If you’re in love with BigTable. And when you need random, realtime read/write access to your Big Data.
For example: Facebook Messaging Database (more general example coming soon)
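The BigTable-style data model behind "billions of rows X millions of columns" can be pictured as a sparse, sorted map: row key, then column (family:qualifier), then timestamp, then value. The class below is a toy in-memory sketch of that shape with a point lookup and a row-range scan. `ToySparseTable` and its methods are hypothetical and say nothing about how HBase stores or serves data internally.

```python
import bisect


class ToySparseTable:
    """Hypothetical sparse table: row key -> column -> timestamp -> value."""

    def __init__(self):
        self._row_keys = []  # kept sorted so that range scans are cheap
        self._rows = {}      # row_key -> {"family:qualifier": {timestamp: value}}

    def put(self, row_key, column, value, timestamp):
        """Store one cell version; rows only hold the columns actually written."""
        if row_key not in self._rows:
            bisect.insort(self._row_keys, row_key)
            self._rows[row_key] = {}
        self._rows[row_key].setdefault(column, {})[timestamp] = value

    def get(self, row_key, column):
        """Return the newest version of a cell, or None if it was never written."""
        versions = self._rows.get(row_key, {}).get(column)
        if not versions:
            return None
        return versions[max(versions)]

    def scan(self, start_row, stop_row):
        """Yield (row_key, newest cells) for start_row <= row_key < stop_row."""
        lo = bisect.bisect_left(self._row_keys, start_row)
        hi = bisect.bisect_left(self._row_keys, stop_row)
        for key in self._row_keys[lo:hi]:
            cells = self._rows[key]
            yield key, {col: vers[max(vers)] for col, vers in cells.items()}
```

Because rows are ordered by key, choosing row keys so that related rows sort next to each other turns many queries into a single contiguous scan.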
Of course, all systems have many more features than what’s listed here. I only wanted to list the key points that I base my decisions on. Also, development of all of them is very fast, so things are bound to change. I’ll do my best to keep this list updated.
– Kristof
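CouchDB
Language: Erlang
Main point: database consistency, ease of use
Best for: accumulating, rarely changing data, or places where versioning is important
Use cases: CRM, CMS
Redis
Language: C
Main point: very fast
Best for: rapidly changing data sets whose total size is predictable
Use cases: stock prices, real-time analytics, real-time data collection, real-time communication
MongoDB
Language: C++
Main point: SQL-like style (queries, indexes)
Best for: dynamic queries; cases where indexes suit you better than map/reduce; cases where you want CouchDB but your data changes more often
Use cases: anything you would do with MySQL or PostgreSQL when having to predefine every column is intolerable
Cassandra
Language: Java
Best for: workloads with more writes than reads
Rest omitted (I personally find it unreliable too)
Riak
Language: Erlang and C, some JavaScript
Reportedly available in open-source and enterprise editions
Main point: fault tolerance
Best for: when you need Cassandra-style scalability without the hassle
Use cases: sales data collection, and in general anywhere even a second of downtime hurts
HBase
Language: Java
Modeled after: BigTable
Best for: random reads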