idning

my CouchDB tutorial




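History


    "Couch" is an acronym for "Cluster Of Unreliable Commodity Hardware". It reflects CouchDB's goals: to be highly scalable and to provide high availability and reliability, even when running on hardware that is prone to failure. CouchDB was originally written in C++, but in April 2008 the project moved to the Erlang OTP platform because of its emphasis on fault tolerance.

    In an interview, Katz said that CouchDB is essentially the core of Lotus Notes, extracted and stripped down to its essentials.

    IBM once funded CouchDB, which allowed Katz to work on the project full time.

    In 2009, Katz, Chris Anderson, and several colleagues founded Relaxed. In November of that year the company received US$2 million in venture funding and was renamed Couchio.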




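Features:

  • NoSQL, document-oriented database
  • Append-only database
  • No central node???

  • Multiversion concurrency control (MVCC)
    • It gives each client a snapshot of the latest version of the database. This means that other users cannot see changes until the transaction is committed. Many modern databases have started to move from locking to MVCC, including Oracle (since V7) and Microsoft® SQL Server 2005 and later.
  • HTTP interface, accessed through a JSON API
  • Powerful B-tree storage engine
  • Map/Reduce

    PS. Everything described above is evolving rapidly... including authentication, filtered replication, URL mapping, and so on; everything that might be needed is being added quickly. (The direction is already obvious: CouchDB will evolve into an app server.)
I don't like this. I think it should be a database with a Map/Reduce framework, not an app server (CouchApp). -Lin Yang 7/23/10 7:57 PM 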

Examples


Get the CouchDB server info:

curl http://127.0.0.1:5984/

 

{"couchdb":"Welcome","version":"0.11.0"}

Create a DB:

curl -X PUT http://127.0.0.1:5984/wiki

    CouchDB will reply with the following message, if the database does not exist:

{"ok":true}

    or, with a different response message, if the database already exists:

{"error":"file_exists","reason":"The database could not be created, the file already exists."}

Get DB info:

curl -X GET http://127.0.0.1:5984/wiki

 

{"db_name":"wiki","doc_count":0,"doc_del_count":0,"update_seq":0,
"purge_seq":0,"compact_running":false,"disk_size":79,
"instance_start_time":"1272453873691070","disk_format_version":5}

Delete a DB:

curl -X DELETE http://127.0.0.1:5984/wiki

 

{"ok":true}

Create a document named apple in wiki:
curl -X PUT http://127.0.0.1:5984/wiki/apple -H "Content-Type: application/json" -d '{}'
 
{"ok":true,"id":"apple","rev":"1801185866"}

Get the document:
curl -X GET http://127.0.0.1:5984/wiki/apple
{"_id":"apple","_rev":"1801185866"}


Update the document:
curl -X PUT http://localhost:5984/wiki/apple -H "Content-Type: application/json" -d '{"_rev":"1801185866" ,"a":3}'
{"ok":true,"id":"apple","rev":"2-b5be0b773091"}
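The create/update cycle above can also be scripted. The following is a minimal Python 3 sketch using only the standard library. The helper name `build_put` is hypothetical, and the requests are only constructed, not sent; passing one to `urllib.request.urlopen` would send it to a running server. It illustrates the one rule that matters for updates: the body has to carry the `_rev` of the revision being replaced, otherwise CouchDB answers with a conflict error.

```python
import json
import urllib.request

BASE = "http://127.0.0.1:5984"  # assumption: the local address used in the curl examples


def build_put(db, doc_id, fields, rev=None):
    """Build (but do not send) a PUT request that creates or updates a document."""
    body = dict(fields)
    if rev is not None:
        # An update must name the revision it is based on.
        body["_rev"] = rev
    return urllib.request.Request(
        url="%s/%s/%s" % (BASE, db, doc_id),
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


create = build_put("wiki", "apple", {})
update = build_put("wiki", "apple", {"a": 3}, rev="1801185866")
print(create.get_method(), create.full_url, create.data)
print(update.get_method(), update.full_url, update.data)
```

The `rev` value is the one returned by the create call above; in a real script it would be read from the JSON response of the previous write.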

A typical CouchDB view (query):

map: function(doc) {
    if (doc._attachments) {
         emit("with attachment", 1);
     }
     else {
         emit("without attachment", 1);
     }
}
reduce: function(keys, values) {
    return sum(values);
}

curl -s -i -X POST -H 'Content-Type: application/json' \
-d '{"map": "function(doc){if(doc._attachments) {emit(\"with\",1);} else {emit(\"without\",1);}}",
"reduce": "function(keys, values) {return sum(values);}"}' \
'http://localhost:5984/somedb/_temp_view?group=true'
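Escaping JavaScript inside JSON by hand, as in the curl command above, is easy to get wrong. A small Python 3 sketch that lets `json.dumps` do the escaping is shown here; `temp_view_body` is a hypothetical helper name, and the body it produces is what would be POSTed to `_temp_view` on the CouchDB 0.11 server used in this post.

```python
import json

MAP_FN = """function(doc) {
    if (doc._attachments) { emit("with", 1); } else { emit("without", 1); }
}"""
REDUCE_FN = "function(keys, values) { return sum(values); }"


def temp_view_body(map_fn, reduce_fn=None):
    """Serialize a view definition; json.dumps escapes quotes and newlines."""
    view = {"map": map_fn}
    if reduce_fn is not None:
        view["reduce"] = reduce_fn
    return json.dumps(view)


body = temp_view_body(MAP_FN, REDUCE_FN)
print(body)
```

Because the functions are kept as ordinary multi-line strings, they stay readable in the script while the request body remains valid JSON.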



Futon interface:


http://localhost:5984/_utils/


About views and Map/Reduce:


A view is defined by a JavaScript function that maps view keys to values.

Two kinds of view

  • permanent view
    • stored inside special documents called design documents
    • create: create a doc at http://localhost:5984/{dbname}/_design/my_views
    • query: GET /{dbname}/{docid}/_view/{viewname}
    •        e.g. /{dbname}/_design/my_views/_view/all_docs

  • temporary view
    • slow, very expensive to compute
    • POST /{dbname}/_temp_view
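To make the permanent-view URLs concrete, here is a hedged Python 3 sketch that builds a design document body and the path used to query one of its views. The helper names `design_doc` and `view_path` are made up for illustration; only the `_design/.../_view/...` layout comes from the list above.

```python
import json
from urllib.parse import urlencode


def design_doc(views):
    """Build the body of a design document holding several named views."""
    return json.dumps({"language": "javascript", "views": views})


def view_path(db, design, view, **params):
    """Path for querying a permanent view, e.g. /wiki/_design/my_views/_view/all_docs."""
    path = "/%s/_design/%s/_view/%s" % (db, design, view)
    if params:
        # Query values are JSON-encoded, so string keys keep their quotes.
        encoded = {name: json.dumps(value) for name, value in sorted(params.items())}
        path += "?" + urlencode(encoded)
    return path


doc = design_doc({"all_docs": {"map": "function(doc) { emit(null, doc); }"}})
print(doc)
print(view_path("wiki", "my_views", "all_docs"))
print(view_path("wiki", "my_views", "all_docs", group=True))
```

The design document would be stored with a PUT to `/wiki/_design/my_views`; after that, a GET on the printed path returns the view rows.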

Creating a view:

In Futon, go to Overview > wiki.
In the upper-right corner, select view: Temporary View.
(If your Futon web client acts funny, clear the cookies Futon created.)

Map & Reduce

map
function(doc) {
    emit(null, doc);
}


    A view function should accept a single argument: the document object. To produce results, it should call the implicitly available emit(key, value) function. For every invocation of that function, a result row is added to the view.

    To be able to filter or sort the view by some document property, you would use that property for the key. For example, the following view would allow you to look up customer documents by the LastName or FirstName fields:

 

function(doc) {
    if (doc.Type == "customer") {
        emit(doc.LastName, {FirstName: doc.FirstName, Address: doc.Address});
        emit(doc.FirstName, {LastName: doc.LastName, Address: doc.Address});
    }
}
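The lookup-by-either-name idea can be tried without a server. The sketch below is a Python 3 stand-in for the map phase, not CouchDB's implementation: `build_view` and the sample documents are made up, and Python's plain string ordering is used in place of CouchDB's own key collation.

```python
def map_customer(doc, emit):
    """Python stand-in for the JavaScript map function above."""
    if doc.get("Type") == "customer":
        emit(doc["LastName"], {"FirstName": doc["FirstName"], "Address": doc["Address"]})
        emit(doc["FirstName"], {"LastName": doc["LastName"], "Address": doc["Address"]})


def build_view(docs, map_fn):
    """Run map_fn over every document and return (key, id, value) rows sorted by key."""
    rows = []
    for doc in docs:
        map_fn(doc, lambda key, value, d=doc: rows.append((key, d["_id"], value)))
    return sorted(rows, key=lambda row: (row[0], row[1]))


docs = [  # made-up sample data
    {"_id": "c1", "Type": "customer", "FirstName": "Ada", "LastName": "Lovelace", "Address": "London"},
    {"_id": "c2", "Type": "customer", "FirstName": "Alan", "LastName": "Turing", "Address": "Wilmslow"},
    {"_id": "o1", "Type": "order", "Total": 10},
]
view = build_view(docs, map_customer)
matches = [row for row in view if row[0] == "Turing"]
print(matches)
```

Each customer contributes two rows, one per name, which is why a single view can answer lookups on either field.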
reduce
function (key, values, rereduce) {
    return sum(values);
}


Reduce functions must handle two cases:

1. When rereduce is false:

  • key will be an array whose elements are arrays of the form [key,id], where key is a key emitted by the map function and id is that of the document from which the key was generated.

  • values will be an array of the values emitted for the respective elements in keys

  • i.e. reduce([ [key1,id1], [key2,id2], [key3,id3] ], [value1,value2,value3], false)

2. When rereduce is true:

  • key will be null

  • values will be an array of values returned by previous calls to the reduce function

  • i.e. reduce(null, [intermediate1,intermediate2,intermediate3], true)
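With `sum(values)` the two cases look identical, which hides why the `rereduce` flag exists. Counting rows is a case where they differ. Below is a small Python 3 sketch of the same calling convention; the function name and sample arguments are illustrative only.

```python
def count_reduce(keys, values, rereduce):
    """Count rows: the two cases need different handling."""
    if rereduce:
        # values are partial counts returned by earlier calls
        return sum(values)
    # values are the raw values emitted by map, one per row
    return len(values)


first = count_reduce([["a", "id1"], ["b", "id2"]], ["x", "y"], False)
second = count_reduce([["c", "id3"]], ["z"], False)
total = count_reduce(None, [first, second], True)
print(first, second, total)
```

Using `len(values)` in the rereduce case would return 2 (the number of partial results) instead of the 3 rows actually counted.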


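Notes:

  • A view may have only a map function.
  • A reduce function must reduce the input values to a smaller output value (the reduce function has to handle the emitted results as well as its own return values).
  • The key passed to emit can be an array, e.g. [X,Y,1].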

Reduce vs. rereduce

Reportedly, suppose a map function produces the following key -> value pairs:
[X, Y, 0] -> Object_A
[X, Y, 1] -> Object_B
[X, Y, 2] -> Object_C
[X, Y, 3] -> Object_D
Then the reduce function receives the following three calls:
reduce( [  [[X,Y,0], id0] , [[X,Y,1], id1]  ], [Object_A, Object_B], false)
reduce( [  [[X,Y,2], id2] , [[X,Y,3], id3]  ], [Object_C, Object_D], false)
reduce( null                                 , [Object_AB, Object_CD], true)


    I still don't understand what rereduce is for, because I would have thought that reduce runs once for each key and that there is no second reduce after that.
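One way to picture the rereduce step: CouchDB does not hand all rows to a single reduce call. It reduces chunks of rows, keeps the partial results in the view's B-tree, and combines those partial results with a rereduce call. The Python 3 toy model below traces that call pattern. `traced_reduce` and the fixed `chunk_size` of 2 are assumptions for illustration; real chunk boundaries depend on the B-tree layout.

```python
def traced_reduce(reduce_fn, rows, chunk_size=2):
    """Toy model: reduce fixed-size chunks of rows, then rereduce the partial results."""
    calls = []
    partials = []
    for start in range(0, len(rows), chunk_size):
        chunk = rows[start:start + chunk_size]
        keys = [[key, doc_id] for key, doc_id, _ in chunk]
        values = [value for _, _, value in chunk]
        calls.append((keys, values, False))
        partials.append(reduce_fn(keys, values, False))
    calls.append((None, list(partials), True))
    return reduce_fn(None, partials, True), calls


def sum_reduce(keys, values, rereduce):
    return sum(values)


# Four rows with keys [X,Y,0] .. [X,Y,3] and made-up numeric values.
rows = [(["X", "Y", i], "id%d" % i, 10 * (i + 1)) for i in range(4)]
result, calls = traced_reduce(sum_reduce, rows)
for call in calls:
    print(call)
print(result)
```

The trace shows two calls with `rereduce` set to false followed by one call with it set to true, the same shape as the three calls listed above.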

group

group=true makes the reduce function group its input by the keys that the map function emitted.

See the test code at the end for the effect.
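The effect of grouping can be sketched offline. The following Python 3 toy model is not CouchDB's implementation: `grouped_reduce` is a hypothetical helper, document ids are left out, and Python's sort order stands in for CouchDB's key collation. It uses the os values and a few prices from the phone test data at the end of this post.

```python
from itertools import groupby


def grouped_reduce(rows, reduce_fn, group=False, group_level=None):
    """Toy model of the group and group_level query parameters."""
    def group_key(row):
        key = row[0]
        if group_level is not None and isinstance(key, list):
            return key[:group_level]      # group by a prefix of an array key
        if group or group_level is not None:
            return key                    # one result row per distinct key
        return None                       # no grouping: one overall result

    ordered = sorted(rows, key=lambda row: row[0])
    results = []
    for gkey, members in groupby(ordered, key=group_key):
        values = [value for _, value in members]
        # Document ids are left out of this sketch, so keys is passed as None.
        results.append((gkey, reduce_fn(None, values, False)))
    return results


def count(keys, values, rereduce):
    return sum(values)


def lowest(keys, values, rereduce):
    return min(values)


os_rows = [(name, 1) for name in
           ["s40", "s40", "s60", "Android", "BlackBerry-OS", "Android", "Mac"]]
price_rows = [(["p", kind], price) for price in (100, 32.5, 500)
              for kind in ("min", "max")]

print(grouped_reduce(os_rows, count, group=True))
print(grouped_reduce(os_rows, count, group=False))
print(grouped_reduce(price_rows, lowest, group_level=1))
print(grouped_reduce(price_rows, lowest, group_level=2))
```

With group=false everything collapses into a single row; with group=true there is one row per distinct key; with a group level, array keys are compared only up to that many elements.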


Debugging Views

There is a log function that can be used to print debug information:
{
     "map": "function(doc) { log(doc); }"
}
tail -f /var/log/couchdb/couch.log


Main documents about views (I find these three documents really hard to understand...)

Here is a fairly good online demo: Only fairly good, not the best. -Lin Yang 7/23/10 7:47 PM 
http://labs.mudynamics.com/wp-content/uploads/2009/04/icouch.html



Installation:

Ubuntu

It is available in the package repositories.

Building from source:

So many dependencies...
yum install js-devel icu libicu libicu-devel
wget http://curl.haxx.se/download/curl-7.21.0.tar.gz && tar vzxf curl-7.21.0.tar.gz && cd curl-7.21.0 && ./configure && make  && make install


# /usr/local/bin/couchdb -b    (-b runs it in the background)
Apache CouchDB has started, time to relax.

If yum does not have it ... install SpiderMonkey

Erlang Client

CouchDB speaks HTTP, so no dedicated client is needed, but having one is nicer...


couchbeam

eCouch

 

erlcouch ..


 



Benchmark

Someone else's numbers:
CouchDB inserts ~2-3k documents / second in a >100k documents database
-------- 0.3-0.4ms / doc
CouchDB inserts get slower on bigger databases

On my 8-core machine with 8 GB of RAM (4 GB free):
ab -n 10000 -c 100 http://localhost:5984/wiki/apple   (read requests)
Server Software:        CouchDB/0.11.0
Server Hostname:        localhost
Document Length:        118 bytes
Requests per second:    3066.17 [#/sec] (mean)
Time per request:       32.614 [ms] (mean)
Time per request:       0.326 [ms] (mean, across all concurrent requests)
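The two "Time per request" figures follow directly from the throughput and the concurrency level, which is a quick way to sanity-check the ab output. A short Python 3 check using only the numbers shown above:

```python
requests_per_second = 3066.17   # from the ab output above
concurrency = 100               # the -c value passed to ab

# Mean time per request across all concurrent requests, in milliseconds.
ms_across_all = 1000.0 / requests_per_second
# Mean time per request as seen by one of the 100 concurrent clients.
ms_per_client = concurrency * ms_across_all

print(round(ms_across_all, 3))
print(round(ms_per_client, 3))
```

The results round to 0.326 ms and 32.614 ms, matching the two lines reported by ab.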





Good documentation:


http://wiki.apache.org/couchdb/Getting_started_with_Erlang
http://en.wikipedia.org/wiki/CouchDB  has examples
http://www.ibm.com/developerworks/cn/opensource/os-couchdb/  introduction + examples
http://www.ibm.com/developerworks/cn/opensource/os-cn-couchdb/index.html  (long, ready)

A blog translating CouchDB: The Definitive Guide, by 时之刻痕

http://wiki.apache.org/couchdb/FrontPage  the main documentation!!!

clients:
http://wiki.apache.org/couchdb/Getting_started_with_Python  Python sample code
http://wiki.apache.org/couchdb/API_Cheatsheet  API
http://news.csdn.net/a/20100714/219109.html

Book:

http://books.couchdb.org/relax/

 

Attached: my own test code:

 

#!/usr/bin/python
#coding:utf-8

# example:
# do_request2('10.99.60.91:8080', '/home', 'PUT', '', {"Content-type": "application/x-www-form-urlencoded"} )
def do_request2(netloc, path, method, data='', headers={}):
    import httplib
    conn = httplib.HTTPConnection(netloc)
    conn.request(method, path, data, headers)
    response = conn.getresponse()
    if response.status/100 == 2:
        data = response.read()
        return data
    print 'ERROR: response.status = %d'% response.status
    print 'response data is ', response.read()

def do_request(url, method, data='', headers={}):
    print '>>>>>>>>>>>>> do_request: ', url
    from urlparse import urlparse 
    o = urlparse(url)
    path = o.path
    if o.query:
        path = path + '?' + o.query
    return do_request2(o.netloc, path, method, data, headers)


def delete_db(db_name):
    return do_request('http://127.0.0.1:5984/%s'%db_name, 'DELETE')

def create_db(db_name):
    return do_request('http://127.0.0.1:5984/%s'%db_name, 'PUT')

def create_doc(db_name, doc_id, doc):
    print 'create_doc %s ' % doc_id
    return do_request('http://127.0.0.1:5984/%s/%s'%(db_name, doc_id), 'PUT', doc, {'Content-Type': 'application/json'})

def get_doc(db_name, doc_id):
    return do_request('http://127.0.0.1:5984/%s/%s'%(db_name, doc_id), 'GET', '', {})
#query_string='?group=false'
def query_temp_view(db_name, doc, query_string=''):
    url = 'http://127.0.0.1:5984/%s/_temp_view%s'%(db_name, query_string)
    return do_request(url, 'POST', doc, {'Content-Type': 'application/json; charset=UTF-8'})

# This call creates one design document, which can hold many views
def create_permanent_view(db_name, view_name, views_json):
    return create_doc(db_name, view_name, views_json) 



def test():
    print delete_db('phone')
    print create_db('phone')
    print create_doc('phone', 'Nokia-5200','''
        {"make": "Nokia", 
        "price": 100, 
        "os": "s40"}
            ''')

    print create_doc('phone', 'Nokia-1661','''
        {"make": "Nokia", 
        "price": 32.5, 
        "os": "s40"}
            ''')

    print create_doc('phone', 'Nokia-E63','''
        {"make": "Nokia", 
        "price": 500, 
        "os": "s60"}
            ''')

    print create_doc('phone', 'HTC-Wildfire','''
        {"make": "HTC", 
        "price": 200, 
        "os": "Android"}
            ''')

    print create_doc('phone', 'BlackBerry-Bold','''
        {"make": "BlackBerry", 
        "price": 300, 
        "os": "BlackBerry-OS"}
            ''')

    print create_doc('phone', 'Samsung-Galaxy-S','''
        {"make": "Samsung", 
        "price": 400, 
        "os": "Android"}
            ''')
    print create_doc('phone', 'iPhone4','''
        {"make": "Apple", 
        "price": 1000, 
        "os": "Mac"}
            ''')
    ############################################################################
    # get all docs
    view = ''' 
{
    "map" : "function(doc){
        emit(doc.price, doc);
    }"
}
    '''
    print query_temp_view('phone', view);


    ############################################################################
    # select sum(price) from phone 
    view = ''' 
{
    "map" : "function(doc){
        emit('all-price', doc.price);
    }",
    "reduce" : "function(key, values, rereduce){
        return sum(values);
    }"
}
    '''
    print query_temp_view('phone', view);


    ############################################################################
    # test permanent_view
    views = '''
{
    "language": "javascript",
    "views": {
        "all_phones": {
            "map" : "function(doc){
                emit(doc.price, doc);
            }"
        },
        "sum_price": {
            "map" : "function(doc){
                emit('all-price', doc.price);
            }",
            "reduce" : "function(key, values, rereduce){
                return sum(values);
            }"
        }
    }
}
    '''
    create_permanent_view('phone', '_design/my_views', views)
    print ':::: retrieve the views DOC: '
    print get_doc('phone', '_design/my_views')
    print ':::: now let us try to query on this views "all_phones"'
    print get_doc('phone', '_design/my_views/_view/all_phones')

    print ':::: And query on this views "sum_price"'
    print get_doc('phone', '_design/my_views/_view/sum_price')

    ############################################################################
    print ':::: we use temp view for test'
    print ':::: get all phones of Nokia: '
    view = ''' 
{
    "map" : "function(doc){
        //log(doc); // debug fun
        if (doc.make == 'Nokia')
            emit(null, doc);
    }"
}
    '''
    print query_temp_view('phone', view);


    ############################################################################
    print ':::: get phones count of every os : '
    view = ''' 
{
    "map" : "function(doc){
        emit(doc.os, 1);
    }",
    "reduce" : "function(key, values, rereduce){
        log('reduce called!!!!');
        log(key);
        log(values);
        log(rereduce);
        return sum(values); 
    }"
}
    '''
    print query_temp_view('phone', view, '?group=true');

    print ':::: let us look at the param group.. if we set group=false : '
    print query_temp_view('phone', view, '?group=false');

    ############################################################################
    print ':::: let us get a list of unique os , just like SQL: SELECT DISTINCT(os) FROM phone'
    view = ''' 
{
    "map" : "function(doc){
        emit(doc.os, null);
    }",
    "reduce" : "function(key, values, rereduce){
        return null;
    }"
}
    '''
    print query_temp_view('phone', view, '?group=true');
   

    ############################################################################
    print ':::: let us get all phones sorted by price || SELECT _id FROM phone ORDER BY price'
    view = ''' 
{
    "map" : "function(doc){
        emit(doc.price, doc._id);
    }"
}
    '''
    print query_temp_view('phone', view);


    ############################################################################
    print ':::: let us get min price || SELECT min(price) FROM phone '
    view = ''' 
{
    "map" : "function(doc){
        emit('p', doc.price);
    }",
    "reduce" : "function(key, values, rereduce){
        return Math.min.apply( Math, values);
            // the "computing min width/height" example (JS simulation) at
            // http://labs.mudynamics.com/wp-content/uploads/2009/04/icouch.html does not work for me
    }"
}
    '''
    print query_temp_view('phone', view, '?group=true');



    ############################################################################
    print ':::: I have to try emitting an array like emit([a,b,c], value)'

    view = ''' 
{
    "map" : "function(doc){
        emit(['p', 'min'], doc.price);
        emit(['p', 'max'], doc.price);
    }",
    "reduce" : "function(key, values, rereduce){
        log('reduce called!!!!');
        log(key);
        log(values);
        log(rereduce);

        return Math.min.apply( Math, values);
    }"
}
    '''
    print query_temp_view('phone', view, '?group=true&group_level=2');

    print ':::: if group_level==1 '
    print ':::: [p, min] and [p, max] will come together to a reduce fun, like this'
    print ':::: [[["p","max"],"BlackBerry-Bold"],[["p","max"],"HTC-Wildfire"],[["p","max"],"iPhone4"],[["p","max"],"Nokia-1661"],[["p","max"],"Nokia-5200"],[["p","max"],"Nokia-E63"],[["p","max"],"Samsung-Galaxy-S"],[["p","min"],"BlackBerry-Bold"],[["p","min"],"HTC-Wildfire"],[["p","min"],"iPhone4"],[["p","min"],"Nokia-1661"],[["p","min"],"Nokia-5200"],[["p","min"],"Nokia-E63"],[["p","min"],"Samsung-Galaxy-S"]]'

    print ':::: if group_level==2 '
    print ':::: [p, min] and [p, max] will come separately, like this'
    print '[[["p","min"],"Samsung-Galaxy-S"],[["p","min"],"Nokia-E63"],[["p","min"],"Nokia-5200"],[["p","min"],"Nokia-1661"],[["p","min"],"iPhone4"],[["p","min"],"HTC-Wildfire"],[["p","min"],"BlackBerry-Bold"]]' 

    # TODO ############################################################################
    print ':::: let us retrieve top N os'









if __name__ == "__main__":
    test()
 
Comments
#1 idning 2010-07-23  
CouchDB 1.0 was released a couple of days ago; reportedly it inserts large documents 300% faster...
