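Preface:
Relational databases have been popular for a long time, but their drawbacks are also obvious: they struggle to manage unstructured and semi-structured data effectively, and the fixed schema of an RDBMS is often too rigid to be acceptable. Databases with a freely extensible schema emerged in response, and these are document databases. With the growth of cloud computing, document databases that support MapReduce, multi-node replication, and inverted-index search are gradually becoming mainstream. Notable open-source projects in this space include Hadoop, CouchDB, and MongoDB, among many others, each with its own characteristics.
Wikipedia's description: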
A document-oriented database is a computer program designed for document-oriented applications. These systems may be implemented as a layer above a relational database or an object database.
For example here's a document:
FirstName="Bob", Address="5 Oak St.", Hobby="sailing".
Another document could be:
FirstName="Jonathan", Address="15 Wanamassa Point Road", Children=[{Name:"Michael",Age:10}, {Name:"Jennifer", Age:8}, {Name:"Samantha", Age:5}, {Name:"Elena", Age:2}].
Notice that the two documents share some fields and differ in others. In a relational database each record would have the same set of fields, and unused fields might be kept empty; here there are no empty "fields" in either document (record). This system allows new information to be added, and it does not require explicitly stating which pieces of information are left out, as relational databases do.
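The two documents above can be sketched in Python as plain dictionaries. This is only an illustration of the idea: `collection` and `field_names` are hypothetical names chosen for this sketch, not part of any database's API.

```python
# The two example documents, modelled as plain dicts.
# `collection` and `field_names` are illustrative names, not a real database API.
bob = {"FirstName": "Bob", "Address": "5 Oak St.", "Hobby": "sailing"}
jonathan = {
    "FirstName": "Jonathan",
    "Address": "15 Wanamassa Point Road",
    "Children": [
        {"Name": "Michael", "Age": 10},
        {"Name": "Jennifer", "Age": 8},
        {"Name": "Samantha", "Age": 5},
        {"Name": "Elena", "Age": 2},
    ],
}

collection = [bob, jonathan]


def field_names(doc):
    """Return the set of top-level fields a document actually carries."""
    return set(doc)


# Fields present in every document, and fields only some documents have.
shared = set.intersection(*(field_names(d) for d in collection))
optional = set.union(*(field_names(d) for d in collection)) - shared

print(sorted(shared))    # ['Address', 'FirstName']
print(sorted(optional))  # ['Children', 'Hobby']
```

Neither dictionary carries a placeholder for the field it lacks, which is the contrast with a fixed relational row that the paragraph describes.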
It is noteworthy here that using XML, YAML or JSON for information storage has advantages similar to those of a document-oriented database. In these languages each record can have a non-standard amount of information. Such information is properly called semi-structured data.
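As a small sketch of that point, the snippet below serializes two records with different shapes to JSON and reads them back using Python's standard `json` module. The record contents are shortened versions of the example documents; the variable names are assumptions made for this illustration.

```python
import json

# Two records with different shapes; neither needs a placeholder
# for the field it does not have.
records = [
    {"FirstName": "Bob", "Hobby": "sailing"},
    {"FirstName": "Jonathan", "Children": [{"Name": "Elena", "Age": 2}]},
]

# Serialize to JSON text and parse it back.
text = json.dumps(records, indent=2)
restored = json.loads(text)

# A reader of semi-structured data has to tolerate missing fields,
# here by supplying an empty default for "Children".
ages = [
    child["Age"]
    for record in restored
    for child in record.get("Children", [])
]

print(restored == records)  # True
print(ages)                 # [2]
```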
Another advantage of document-oriented databases is the ease of usage and programming, so that untrained business users, for example, can create applications and design their own databases. Information can be added without worrying about the "record size", so programmers simply need to build an interface that allows the information to be entered easily.
Implementations
Name | Publisher | License | Language | Notes | RESTful API
Lotus Notes | IBM | Proprietary | | | (unknown)
askSam | askSam Systems | Proprietary | | | (unknown)
Apstrata | Apstrata | Proprietary | | | (unknown)
Datawasp | Significant Data Systems | Proprietary | | | (unknown)
CRX | Day Software | Proprietary | | | (unknown)
MUMPS Database[1] | | Proprietary and GNU Affero GPL[2] | MUMPS | Commonly used in health applications. | (unknown)
UniVerse | Rocket Software | Proprietary | | | Yes (Beta)
UniData | Rocket Software | Proprietary | | | Yes (Beta)
Jackrabbit | Apache | Apache License | Java | | (unknown)
CouchDB | Apache | Apache License | Erlang | JSON over HTTP | Yes
FleetDB | FleetDB | MIT License | Clojure | A JSON-based schema-free database optimized for agile development. | (unknown)
MongoDB | | GNU AGPL v3.0[3] | C++ | Fast, document-oriented database optimized for highly transient data. | (unknown)
GemFire Enterprise[2] | VMWare | Commercial | Java, .NET, C++ | Memory-oriented, fast, key-value database with indexing and querying support. | Yes
OrientDB | OrientDB | Apache License | Java | JSON over HTTP | Yes
RavenDB | RavenDB | Commercial or GNU AGPL v3.0 | .NET | A .NET LINQ-enabled document database, focused on providing a high-performance, transactional, schema-less, flexible and scalable NoSQL data store for the .NET and Windows platforms. | Yes
Redis | | BSD License | ANSI C | Key-value store supporting lists and sets with a fast, simple and binary-safe protocol. | (unknown)
StrokeDB | [3] | MIT License | | Alpha software. | (unknown)
Terrastore | | Apache License | Java | JSON/HTTP | (unknown)
ThruDB | | BSD License | C++, Java | Built on top of the Apache Thrift framework; provides indexing and document storage services for building and scaling websites. An alternate implementation is being developed in Java. Alpha software. | (unknown)
Persevere | Persevere | BSD License | | A JSON database and JavaScript application server. Provides a RESTful JSON interface for create, read, update, and delete access to data. Also supports JSONQuery/JSONPath querying. | Yes
DBSlayer | DBSlayer | Apache License | C | Database abstraction layer (over MySQL) used by the New York Times. JSON over HTTP. | (unknown)
Eloquera DB | Eloquera | Proprietary | .NET | High performance. Based on dynamic objects. Supports LINQ, SQL queries. | (unknown)
XML database implementations
Main article: XML database
All XML databases are document-oriented databases.
An XML database is a data persistence software system that allows data to be stored in XML format. This data can then be queried, exported and serialized into the desired format.
Two major classes of XML database exist:
- XML-enabled: these map all XML to a traditional database (such as a relational database[4]), accepting XML as input and rendering XML as output. This term implies that the database does the conversion itself (as opposed to relying on middleware).
- Native XML (NXD): the internal model of such databases depends on XML and uses XML documents as the fundamental unit of storage, which are, however, not necessarily stored in the form of text files.
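The XML-enabled approach can be sketched as follows: XML comes in, is mapped onto relational rows, and is rendered as XML again on the way out. This is a minimal illustration using Python's standard `sqlite3` and `xml.etree.ElementTree` modules, not how any particular product does it; the `person` table, the element names, and the `load_xml` and `dump_xml` helpers are all assumptions made for this example.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical input document for the sketch.
XML_IN = """<people>
  <person><name>Bob</name><hobby>sailing</hobby></person>
  <person><name>Jonathan</name></person>
</people>"""


def load_xml(conn, xml_text):
    """Map each <person> element onto one row of a relational table."""
    conn.execute("CREATE TABLE person (name TEXT, hobby TEXT)")
    for person in ET.fromstring(xml_text).findall("person"):
        # findtext returns None for a missing element, stored as NULL.
        conn.execute(
            "INSERT INTO person (name, hobby) VALUES (?, ?)",
            (person.findtext("name"), person.findtext("hobby")),
        )


def dump_xml(conn):
    """Render the table back out as XML, omitting NULL columns."""
    root = ET.Element("people")
    rows = conn.execute("SELECT name, hobby FROM person ORDER BY rowid")
    for name, hobby in rows:
        person = ET.SubElement(root, "person")
        ET.SubElement(person, "name").text = name
        if hobby is not None:
            ET.SubElement(person, "hobby").text = hobby
    return ET.tostring(root, encoding="unicode")


conn = sqlite3.connect(":memory:")
load_xml(conn, XML_IN)
xml_out = dump_xml(conn)
print(xml_out)
```

Inside the table, Jonathan's missing hobby becomes a NULL column, so the XML structure survives only because the conversion code knows how to rebuild it. A native XML database would keep the document itself as the unit of storage.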
Rationale for XML in databases
O'Connell (2005, 9.2) gives one reason for the use of XML in databases: the increasingly common use of XML for data transport, which has meant that "data is extracted from databases and put into XML documents and vice-versa". It may prove more efficient (in terms of conversion costs) and easier to store the data in XML format.
Native XML databases
The term "native XML database" (NXD) can lead to confusion. Many NXDs do not function as standalone databases at all, and do not really store the native (text) form.
The formal definition from the XML:DB initiative (which appears to be inactive since 2003[5]) states that a native XML database:
- Defines a (logical) model for an XML document (as opposed to the data in that document) and stores and retrieves documents according to that model. At a minimum, the model must include elements, attributes, PCDATA, and document order. Examples of such models include the XPath data model, the XML Infoset, and the models implied by the DOM and the events in SAX 1.0.
- Has an XML document as its fundamental unit of (logical) storage, just as a relational database has a row in a table as its fundamental unit of (logical) storage.
- Need not have any particular underlying physical storage model. For example, NXDs can use relational, hierarchical, or object-oriented database structures, or use a proprietary storage format (such as indexed, compressed files).
Additionally, many XML databases provide a logical model of grouping documents, called "collections". Databases can set up and manage many collections at one time. In some implementations, a hierarchy of collections can exist, much in the same way that an operating system's directory structure works.
All XML databases now support at least one form of querying syntax. Minimally, just about all of them support XPath for performing queries against documents or collections of documents. XPath provides a simple pathing system that allows users to identify nodes that match a particular set of criteria.
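To give a feel for XPath-style node selection, here is a small sketch using Python's standard `xml.etree.ElementTree`, which implements only a limited subset of XPath. The sample document and its element and attribute names are invented for the illustration, and a real XML database would expose its own query interface.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document.
doc = ET.fromstring("""<people>
  <person><name>Bob</name><hobby>sailing</hobby></person>
  <person><name>Jonathan</name>
    <child age="10">Michael</child>
    <child age="2">Elena</child>
  </person>
</people>""")

# All <name> elements anywhere below the root.
names = [e.text for e in doc.findall(".//name")]

# <person> elements that have at least one <child> element.
parents = [p.findtext("name") for p in doc.findall("./person[child]")]

# <child> elements whose age attribute equals "2".
toddlers = [c.text for c in doc.findall(".//child[@age='2']")]

print(names)     # ['Bob', 'Jonathan']
print(parents)   # ['Jonathan']
print(toddlers)  # ['Elena']
```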
In addition to XPath, many XML databases support XSLT as a method of transforming documents or query results retrieved from the database. XSLT provides a declarative language written using an XML grammar. It aims to define a set of XPath filters that can transform documents (in part or in whole) into other formats including plain text, XML, or HTML.
Many XML databases also support XQuery to perform querying. XQuery includes XPath as a node-selection method, but extends XPath to provide transformational capabilities. Users sometimes refer to its syntax as "FLWOR" (pronounced "flower") because the query may include the following clauses: 'for', 'let', 'where', 'order by' and 'return'. Traditional RDBMS vendors (who traditionally had SQL-only engines) are now shipping hybrid SQL and XQuery engines. Hybrid SQL/XQuery engines help to query XML data alongside the relational data, in the same query expression. This approach helps in combining relational and XML data.
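Python's standard library has no XQuery engine, so the following is only an analogy: it walks through the five FLWOR clauses in plain Python over an `xml.etree.ElementTree` document. The sample data and the XQuery text in the comment are illustrative assumptions, and the XQuery is not executed here.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document.
doc = ET.fromstring("""<people>
  <child age="10">Michael</child>
  <child age="8">Jennifer</child>
  <child age="5">Samantha</child>
  <child age="2">Elena</child>
</people>""")

# Roughly the shape of the XQuery being imitated (not executed):
#   for $c in /people/child
#   let $age := xs:integer($c/@age)
#   where $age > 4
#   order by $age
#   return $c/text()
rows = []
for c in doc.findall("child"):        # for: iterate over the nodes
    age = int(c.get("age"))           # let: bind a derived value
    if age > 4:                       # where: filter
        rows.append((age, c.text))
rows.sort(key=lambda row: row[0])     # order by: sort on the bound value
result = [name for _, name in rows]   # return: project the output

print(result)  # ['Samantha', 'Jennifer', 'Michael']
```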
Some XML databases support an API called the XML:DB API (or XAPI) as a form of implementation-independent access to the XML datastore. In XML databases, XAPI resembles ODBC and JDBC as used with relational databases. On the 24th of June 2009, the Java Community Process released the final version of the XQuery API for Java specification (XQJ): "a common API that allows an application to submit queries conforming to the W3C XQuery 1.0 specification and to process the results of such queries".
XML Databases with database APIs (XQJ, XML:DB, RESTful)
References
External references
- XML Databases - The Business Case, Charles Foster, June 2008. Talks about the current state of databases and data persistence, how the current relational database model is starting to crack at the seams, and gives an insight into a strong alternative for today's requirements.
- An XML-based Database of Molecular Pathways (2005-06-02). Speed / performance comparisons of eXist, X-Hive, Sedna and Qizx/open.
- XML Native Database Systems: Review of Sedna, Ozone, NeoCoreXMS 2006
- XML Data Stores: Emerging Practices
- Bhargava, P.; Rajamani, H.; Thaker, S.; Agarwal, A. (2005) XML Enabled Relational Databases, Texas, The University of Texas at Austin.
- O'Connell, S. Advanced Databases Course Notes, Southampton, University of Southampton, 2005
- Initiative for XML Databases
- XML and Databases, Ronald Bourret, September 2005
- XML Database Products, Ronald Bourret, 2000–2009
- The State of Native XML Databases, Elliotte Rusty Harold, August 13, 2007