In the NoSQL world it is common to talk about schemaless databases or data models.
It would be more precise to say “dynamic schema”. In MongoDB, there
are databases; a system catalog of collections; documents within
collections; and explicitly declared indexes for a collection. The big
difference is that “columns”, or rather fields in the document data
model, are not predeclared. Each field/value pair in a document is dynamic
and can be present or missing. Each value has a datatype too, so the model
isn’t typeless but rather dynamically typed, or what some might call duck typed.
Here’s an example in the mongo shell. We may have a couple docs:
> db.persons.find()
{ "name" : "jane", "age" : 25 }
{ "name" : "ben", "age" : 30 }
We could then add a new person with an extra attribute:
> db.persons.insert({name:'julie',age:28,likes:'baseball'})
> db.persons.find()
{ "name" : "jane", "age" : 25 }
{ "name" : "ben", "age" : 30 }
{ "name" : "julie", "age" : 28, "likes" : "baseball" }
No “alter table” necessary. This is very helpful with agile development methodologies.
We can take it a step further, however. The type of a field’s value need not
be consistent from document to document. Now, in practice, it is very
common for the contents of a collection to be homogeneous. But we
have the option. For example, suppose we want to add “likes” for ben,
but ben likes a couple of things. What to do?
> db.persons.update({name:'ben'},{$set:{likes:['math','baseball']}})
> db.persons.find()
{ "name" : "jane", "age" : 25 }
{ "name" : "julie", "age" : 28, "likes" : "baseball" }
{ "name" : "ben", "age" : 30, "likes" : [ "math", "baseball" ] }
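To make the "typed but dynamic" point concrete, here is a minimal sketch in plain JavaScript (no MongoDB server involved). The helper name fieldTypes is hypothetical, and it reports JavaScript runtime types rather than MongoDB's own BSON types, so treat it as an illustration of the idea only.

```javascript
// Hypothetical helper: report the runtime type of each field in a document.
// These are plain JavaScript types, used here as a stand-in for BSON types.
function fieldTypes(doc) {
  const types = {};
  for (const [name, value] of Object.entries(doc)) {
    types[name] = Array.isArray(value) ? "array" : typeof value;
  }
  return types;
}

// The same three documents as in the shell session above.
const persons = [
  { name: "jane", age: 25 },
  { name: "julie", age: 28, likes: "baseball" },
  { name: "ben", age: 30, likes: ["math", "baseball"] },
];

// Each document carries its own set of fields, and each value has a type.
for (const doc of persons) {
  console.log(doc.name, fieldTypes(doc));
}
```

In this sketch jane has no likes field at all, julie's is a string, and ben's is an array: the field is optional and its type varies, but every value that is present is typed.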
In this example, things work out particularly elegantly: even
though one likes value is an array and the other a string, we can still
run useful queries across both. This is because when
querying for a value, if the stored value is an array, MongoDB looks into the
array and matches on any element:
> db.persons.find({likes:'baseball'})
{ "name" : "julie", "age" : 28, "likes" : "baseball" }
{ "name" : "ben", "age" : 30, "likes" : [ "math", "baseball" ] }
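The matching rule just described can be modeled in a few lines of plain JavaScript. This is a simplified sketch, not MongoDB's actual implementation: matchesEquality is a hypothetical helper, and it only covers equality against a scalar query value.

```javascript
// Simplified model of equality matching on a field that may hold
// either a scalar or an array. Not MongoDB's real query engine.
function matchesEquality(doc, field, value) {
  const stored = doc[field];
  if (Array.isArray(stored)) {
    // Look into the array: any matching element counts as a match.
    return stored.includes(value);
  }
  // Scalar comparison; a missing field is undefined, so it does not
  // equal a string such as "baseball".
  return stored === value;
}

const persons = [
  { name: "jane", age: 25 },
  { name: "julie", age: 28, likes: "baseball" },
  { name: "ben", age: 30, likes: ["math", "baseball"] },
];

const baseballFans = persons.filter((doc) =>
  matchesEquality(doc, "likes", "baseball")
);
// julie (string value) and ben (array value) match; jane does not.
console.log(baseballFans.map((doc) => doc.name));
```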
Likewise we can index the field:
> db.persons.ensureIndex( { likes : 1 } )
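An index on a field that sometimes holds an array needs one entry per array element. The sketch below illustrates that idea in plain JavaScript; buildIndex is a hypothetical helper, not a MongoDB API, and skipping documents that lack the field is a simplification made for this sketch.

```javascript
// Conceptual sketch of an index over a field that may be scalar or array.
// Maps each distinct value to the positions of the documents containing it.
function buildIndex(docs, field) {
  const index = new Map();
  docs.forEach((doc, position) => {
    if (!(field in doc)) return; // simplification: unindexed if field is absent
    const values = Array.isArray(doc[field]) ? doc[field] : [doc[field]];
    // One index entry per distinct element, so arrays fan out to many keys.
    for (const value of new Set(values)) {
      if (!index.has(value)) index.set(value, []);
      index.get(value).push(position);
    }
  });
  return index;
}

const persons = [
  { name: "jane", age: 25 },
  { name: "julie", age: 28, likes: "baseball" },
  { name: "ben", age: 30, likes: ["math", "baseball"] },
];

const likesIndex = buildIndex(persons, "likes");
// Lookups go through the map instead of scanning every document.
console.log(likesIndex.get("baseball").map((i) => persons[i].name));
console.log(likesIndex.get("math").map((i) => persons[i].name));
```

With this layout, the string value and the array elements land in the same key space, which is why a single lookup for 'baseball' can find both julie and ben.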
All very handy and useful. But you might ask “won’t my data get
rather dirty with no schema constraints?” I had this concern when we
started; I assumed we would just add some constraint rules later when
needed. Oddly, there hasn’t been a lot of demand for the feature, so
far. Empirically, it seems the data doesn’t get too noisy.
One other very important note: the dynamic schema is not just for
developer friendliness! There is another good reason for it. Imagine
changing the schema in a database cluster involving 2,000 servers. It
might be tricky to change that global state in a consistent
manner across every server. One goal here is to store very big data sets. An “alter table” is
probably not going to fly with billions or trillions of documents.
P.S. For compactness, the examples above do not show the _id field that MongoDB or its driver automatically adds to all documents.
P.P.S. Dynamic schema is not unique to MongoDB; some other products
in the space do it too. Of course I’m biased: this is my favorite.