
[Translation] Berkeley DB Documentation - Getting Started with C++ - Section 1.3 - Access Methods

Posted: 2007-05-11
C++

Access Methods

While this manual focuses primarily on the BTree access method, it is still useful to briefly describe all of the access methods that DB makes available.

Note that an access method can be selected only when the database is created. Once selected, actual API usage is generally identical across all access methods. That is, while some exceptions exist, mechanically you interact with the library in the same way regardless of which access method you have selected.

The access method that you should choose is determined first by what you want to use as a key, and second by the performance that you see for a given access method.

The following are the available access methods:

BTree

Data is stored in a sorted, balanced tree structure. Both the key and the data for BTree records can be arbitrarily complex. That is, they can contain single values such as an integer or a string, or complex types such as a structure. Also, although it is not the default behavior, two records can use keys that compare as equal. When this occurs, the records are considered to be duplicates of one another.
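As a rough illustration of sorted keys, structured data, and optional duplicates, here is a small sketch that uses only the C++ standard library. It is an analogy and not the Berkeley DB API: `std::multimap` is usually a red-black tree rather than a B-tree, and `Employee`, `Index`, `make_index`, `keys_in_order` and `count_duplicates` are hypothetical names made up for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Hypothetical "complex" data type: a plain structure.
struct Employee {
    std::string name;
    int department;
};

// A sorted container that accepts duplicate keys. This only models the
// behavior described above; it is not how Berkeley DB stores records.
using Index = std::multimap<std::string, Employee>;

Index make_index() {
    Index idx;
    idx.emplace("smith", Employee{"Ann Smith", 7});
    idx.emplace("adams", Employee{"Bo Adams", 3});
    idx.emplace("smith", Employee{"Cy Smith", 9});  // same key: a duplicate
    assert(idx.size() == 3);  // the duplicate was kept, not overwritten
    return idx;
}

// Keys come back in sorted order regardless of insertion order.
std::vector<std::string> keys_in_order(const Index& idx) {
    std::vector<std::string> keys;
    for (const auto& entry : idx) {
        keys.push_back(entry.first);
    }
    return keys;
}

// Number of records stored under one key.
std::size_t count_duplicates(const Index& idx, const std::string& key) {
    return idx.count(key);
}
```

Calling `keys_in_order(make_index())` yields the keys in sorted order with "smith" appearing twice. Swapping `std::multimap` for `std::map` would model the default, no-duplicates behavior.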

Hash

Data is stored in an extended linear hash table. Like BTree, the key and the data used for Hash records can be arbitrarily complex. Also, like BTree, duplicate records are optionally supported.

Queue

Data is stored in a queue as fixed-length records. Each record uses a logical record number as its key. This access method is designed for fast inserts at the tail of the queue, and it has a special operation that deletes and returns a record from the head of the queue.

This access method is unusual in that it provides record-level locking. This can improve performance in applications that require concurrent access to the queue.
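The Queue behavior described above (fixed-length records, record numbers as keys, appends at the tail, a consume operation at the head) can be sketched with a small in-memory model. This is standard C++ only and not the Berkeley DB API. The names `FixedQueueModel`, `make_record` and `kRecordLen`, the 8-byte record length, the space padding, and numbering that starts at 1 are all assumptions of this sketch. Locking is not modeled.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <deque>
#include <string>

constexpr std::size_t kRecordLen = 8;  // assumed fixed record length
using Record = std::array<char, kRecordLen>;
using RecordNumber = std::uint32_t;

// Build a fixed-length record; short input is padded with spaces
// (an assumption of this model), and long input is rejected.
Record make_record(const std::string& text) {
    assert(text.size() <= kRecordLen);
    Record rec;
    rec.fill(' ');
    std::copy(text.begin(), text.end(), rec.begin());
    return rec;
}

class FixedQueueModel {
public:
    // Append at the tail. The record number is assigned by the queue,
    // not chosen by the caller.
    RecordNumber append(const Record& rec) {
        records_.push_back(rec);
        return next_recno_++;
    }

    // Delete the head record and return it with its record number.
    // Returns false if the queue is empty.
    bool consume(RecordNumber& recno, Record& rec) {
        if (records_.empty()) {
            return false;
        }
        recno = head_recno_++;
        rec = records_.front();
        records_.pop_front();
        return true;
    }

    std::size_t size() const { return records_.size(); }

private:
    std::deque<Record> records_;
    RecordNumber head_recno_ = 1;  // number of the current head record
    RecordNumber next_recno_ = 1;  // number the next append will receive
};
```

Appending two records and then consuming once returns the first record with its original number, and the second record keeps the number it was given.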

Recno

Data is stored in either fixed-length or variable-length records. Like Queue, Recno records use logical record numbers as keys.

Selecting Access Methods

To select an access method, you should first consider what you want to use as a key for your database records. If you want to use arbitrary data (even strings), then you should use either BTree or Hash. If you want to use logical record numbers (essentially integers), then you should use Queue or Recno.
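This first decision can be written down as a tiny lookup. The sketch below is plain C++ with hypothetical names (`KeyKind`, `AccessMethod`, `candidate_methods`); it only encodes the rule stated above and does not call Berkeley DB.

```cpp
#include <cassert>
#include <utility>

enum class KeyKind { ArbitraryData, LogicalRecordNumber };
enum class AccessMethod { BTree, Hash, Queue, Recno };

// Map the kind of key to the two access methods that remain candidates.
std::pair<AccessMethod, AccessMethod> candidate_methods(KeyKind kind) {
    switch (kind) {
        case KeyKind::ArbitraryData:
            return {AccessMethod::BTree, AccessMethod::Hash};
        case KeyKind::LogicalRecordNumber:
            return {AccessMethod::Queue, AccessMethod::Recno};
    }
    assert(false && "unhandled KeyKind");
    return {AccessMethod::BTree, AccessMethod::Hash};
}
```

For example, string keys fall under `KeyKind::ArbitraryData`, which leaves BTree and Hash to choose from.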

Once you have made this decision, you must choose between either BTree and Hash, or Queue and Recno. This decision is described next.

Choosing between BTree and Hash

For small working datasets that fit entirely in memory, there is no difference between BTree and Hash; both perform equally well. In this situation, you might as well use BTree, if for no other reason than that the majority of DB applications use BTree.

Note that the main concern here is your working dataset, not your entire dataset. Many applications maintain large amounts of information but only need to access a small portion of that data with any frequency. So what you want to consider is the data that you will routinely use, not the sum total of all the data managed by your application.

However, as your working dataset grows to the point where you cannot fit it all into memory, you need to take more care when choosing your access method. Specifically, choose:

    *    BTree if your keys have some locality of reference. That is, if they sort well and you can expect that a query for a given key will likely be followed by a query for one of its neighbors.

    *    Hash if your dataset is extremely large. For any given access method, DB must maintain a certain amount of internal information. However, the amount of information that DB must maintain for BTree is much greater than for Hash. The result is that as your dataset grows, this internal information can dominate the cache to the point where there is relatively little space left for application data. As a result, BTree can be forced to perform disk I/O much more frequently than Hash would, given the same amount of data.

    Moreover, if your dataset becomes so large that DB will almost certainly have to perform disk I/O to satisfy a random request, then Hash will definitely outperform BTree because it has fewer internal records to search through than BTree does.
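The guidance above can be condensed into a small decision function. This is only a sketch under stated assumptions, written in plain C++ with hypothetical names (`Workload`, `choose_btree_or_hash`). The text gives no numeric threshold for "extremely large", so that is left as a flag the caller sets. The text also does not rank the two criteria against each other; this sketch checks locality first, on the assumption that the Hash advantage is described for random requests, and it falls back to BTree when neither criterion applies.

```cpp
#include <cassert>
#include <cstdint>

enum class AccessMethod { BTree, Hash };

struct Workload {
    std::uint64_t working_set_bytes;  // data used routinely, not the whole dataset
    std::uint64_t cache_bytes;        // memory available to hold that data
    bool keys_have_locality;          // a query is often followed by one for a neighbor
    bool dataset_extremely_large;     // caller's judgment; no threshold is given
};

AccessMethod choose_btree_or_hash(const Workload& w) {
    assert(w.cache_bytes > 0);

    // Working set fits in memory: no real difference, so use BTree,
    // which most applications use.
    if (w.working_set_bytes <= w.cache_bytes) {
        return AccessMethod::BTree;
    }
    // Assumption of this sketch: locality is checked before size.
    if (w.keys_have_locality) {
        return AccessMethod::BTree;
    }
    if (w.dataset_extremely_large) {
        return AccessMethod::Hash;
    }
    // Assumption of this sketch: BTree is the fallback.
    return AccessMethod::BTree;
}
```

Measuring both access methods against a realistic workload remains the more reliable way to decide; the function only restates the rules of thumb.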

Choosing between Queue and Recno

Queue or Recno are used when the application wants to use logical record numbers for the primary database key. Logical record numbers are essentially integers that uniquely identify the database record. They can be either mutable or fixed, where a mutable record number is one that might change as database records are stored or deleted. Fixed logical record numbers never change regardless of what database operations are performed.
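The difference between mutable and fixed record numbers is easiest to see on a deletion. The following sketch models both numbering schemes with standard containers. It is not the Berkeley DB API, and `MutableNumbering` and `FixedNumbering` are hypothetical names. In the mutable model a record's number is simply its 1-based position, which is an assumption of this sketch.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <string>
#include <vector>

using RecordNumber = std::uint32_t;

// Mutable numbering: a record's number is its 1-based position, so
// deleting a record renumbers every record after it.
class MutableNumbering {
public:
    RecordNumber append(const std::string& data) {
        records_.push_back(data);
        return static_cast<RecordNumber>(records_.size());
    }
    void erase(RecordNumber recno) {
        assert(recno >= 1 && recno <= records_.size());
        records_.erase(records_.begin() + static_cast<std::ptrdiff_t>(recno - 1));
    }
    const std::string& at(RecordNumber recno) const {
        return records_.at(recno - 1);
    }

private:
    std::vector<std::string> records_;
};

// Fixed numbering: a number is assigned once and never changes, so
// deleting a record leaves a gap and other records keep their numbers.
class FixedNumbering {
public:
    RecordNumber append(const std::string& data) {
        records_[next_] = data;
        return next_++;
    }
    void erase(RecordNumber recno) { records_.erase(recno); }
    bool contains(RecordNumber recno) const {
        return records_.count(recno) != 0;
    }
    const std::string& at(RecordNumber recno) const {
        return records_.at(recno);
    }

private:
    std::map<RecordNumber, std::string> records_;
    RecordNumber next_ = 1;
};
```

After appending "a", "b", "c" and erasing record 2, the mutable model reports "c" as record 2, while the fixed model still reports "c" as record 3 and has no record 2.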

When deciding between Queue and Recno, choose:

    *      Queue if your application requires high degrees of concurrency. Queue provides record-level locking (as opposed to the page-level locking that the other access methods use), and this can result in significantly higher throughput for highly concurrent applications.
   
    Note, however, that Queue provides support only for fixed-length records. So if the size of the data that you want to store varies widely from record to record, you should probably choose an access method other than Queue.
   
    *      Recno if you want mutable record numbers. Queue is only capable of providing fixed record numbers. Also, Recno provides support for databases whose permanent storage is a flat text file. This is useful for applications looking for fast, temporary storage while the data is being read or modified.
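The Queue versus Recno criteria can likewise be restated as a short decision function. As before, this is plain C++ with hypothetical names (`Requirements`, `choose_queue_or_recno`) and not part of Berkeley DB. The text does not say what to pick when no criterion applies, so the Recno fallback at the end is an assumption of this sketch.

```cpp
#include <cassert>

enum class AccessMethod { Queue, Recno };

struct Requirements {
    bool needs_high_concurrency;        // many concurrent readers and writers
    bool records_fixed_length;          // every record has the same size
    bool needs_mutable_record_numbers;  // numbers may change on store or delete
    bool backed_by_flat_text_file;      // permanent storage is a flat text file
};

AccessMethod choose_queue_or_recno(const Requirements& r) {
    AccessMethod choice = AccessMethod::Recno;  // assumed fallback

    // Queue is only an option when nothing it cannot do is required.
    const bool queue_possible = r.records_fixed_length &&
                                !r.needs_mutable_record_numbers &&
                                !r.backed_by_flat_text_file;
    if (queue_possible && r.needs_high_concurrency) {
        choice = AccessMethod::Queue;
    }

    // Invariant: Queue is never chosen for variable-length records.
    assert(choice != AccessMethod::Queue || r.records_fixed_length);
    return choice;
}
```

A highly concurrent application with fixed-length records gets Queue; the same application with variable-length records gets Recno.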