
Apache Cassandra Learning Step by Step (2): Core Concepts


====15 Feb 2012, by Bright Zheng (IT进行时)====

3. Core Concepts

3.1.  Keyspace

3.1.1.  Intro

A keyspace is the first dimension of the Cassandra hash and the container for ColumnFamilies. A keyspace has roughly the same granularity as a schema or database (i.e. a logical collection of tables) in the RDBMS world. Keyspaces are the configuration and management point for column families, and are also the structure on which batch inserts are applied. In most cases you will have one keyspace per application.
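To make the "dimensions of the hash" idea concrete, here is a minimal sketch that models it with nested Python dictionaries. It is only an in-memory illustration and not a Cassandra client API; the column family name `Person` and the helper `get_column` are hypothetical names chosen for this example.

```python
# Illustrative in-memory model only; this is not a Cassandra client API.
# Dimensions: keyspace -> column family -> row key -> column name -> (value, timestamp)
cluster = {
    "Tutorial": {                      # keyspace
        "StateCity": {},               # column family with no rows yet
        "Person": {                    # hypothetical column family
            "1": {                     # row key
                "firstname": ("Ronald", 1270073054),
                "lastname": ("Mathies", 1270073054),
            },
        },
    },
}


def get_column(keyspace, column_family, row_key, column_name):
    """Walk the nested hash one dimension at a time."""
    return cluster[keyspace][column_family][row_key][column_name]


value, timestamp = get_column("Tutorial", "Person", "1", "firstname")
print(value, timestamp)
```

The keyspace is simply the outermost key, which is why it is the natural place to hang configuration that applies to every column family inside it.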

3.1.2. CLI

    [default@unknown] drop keyspace Tutorial;
    876ee520-571f-11e1-0000-242d50cf1ffd
    Waiting for schema agreement...
    ... schemas agree across the cluster
    [default@unknown] create keyspace Tutorial
    ...  with strategy_options = [{replication_factor:1}]
    ...  and placement_strategy = 'org.apache.cassandra.locator.SimpleStrategy';
    WARNING: [{}] strategy_options syntax is deprecated, please use {}
    8daea060-571f-11e1-0000-242d50cf1ffd
    Waiting for schema agreement...
    ... schemas agree across the cluster
    [default@unknown] use Tutorial;
    Authenticated to keyspace: Tutorial
    [default@Tutorial]

3.2. Column Family

3.2.1. Intro

A column family is a container for columns, something like the TABLE in a relational system.

Model representation:

    ColumnFamily
    key       list
    binary    1 .. * Columns

Data representation:

    ColumnFamily
    key    Columns
           name           value           timestamp
    1      "firstname"    "Ronald"        1270073054
           "lastname"     "Mathies"       1270073054
           "birthday"     "01/01/1978"    1270073054
    2      "firstname"    "John"          1270084021
           "lastname"     "Steward"       1270084021
           "birthday"     "01/01/1982"    1270084021

3.2.2. CLI

    [default@Tutorial] drop column family StateCity;
    StateCity not found in current keyspace.
    [default@Tutorial] create column family StateCity
    ...         with comparator = LongType
    ...         and default_validation_class = 'UTF8Type'
    ...         and key_validation_class = 'UTF8Type';
    d3cfa8f0-571f-11e1-0000-242d50cf1ffd
    Waiting for schema agreement...
    ... schemas agree across the cluster
    [default@Tutorial]

Where:

1. Comparator is used to validate and compare/sort the Column names in the CF. The supported types are:

    Type               Description
    BytesType          Simple non-validating byte comparison (default)
    AsciiType          Similar to BytesType, but validates that input is US-ASCII
    UTF8Type           UTF-8 encoded string comparison
    LongType           Compares values as 64-bit longs
    LexicalUUIDType    128-bit UUID compared by byte value
    TimeUUIDType       128-bit version 1 UUID compared by timestamp

Note:

  1. The set of supported types may differ between Cassandra versions.
  2. It is also valid to specify the fully-qualified name of a custom class that extends org.apache.cassandra.db.marshal.AbstractType.
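The choice of comparator changes the order in which column names are sorted within a row. The sketch below imitates that effect in plain Python; it is not Cassandra's implementation. The helpers `long_type_key` and `utf8_type_key` are hypothetical names, and the sketch assumes a long column name is stored as 8 big-endian bytes.

```python
import struct


def long_type_key(name: bytes) -> int:
    """Sort key imitating a LongType-style comparator: decode 8 bytes as a signed long."""
    if len(name) != 8:
        raise ValueError("a long column name must be exactly 8 bytes")
    return struct.unpack(">q", name)[0]


def utf8_type_key(name: bytes) -> bytes:
    """Sort key imitating a UTF8Type-style comparator: validate, then compare the bytes."""
    name.decode("utf-8")  # raises UnicodeDecodeError if the name is not valid UTF-8
    return name


numbers = [10, 2, 1]
as_longs = [struct.pack(">q", n) for n in numbers]     # names stored as 64-bit longs
as_text = [str(n).encode("utf-8") for n in numbers]    # names stored as UTF-8 text

sorted_longs = [long_type_key(b) for b in sorted(as_longs, key=long_type_key)]
sorted_text = [b.decode("utf-8") for b in sorted(as_text, key=utf8_type_key)]

print(sorted_longs)  # [1, 2, 10]
print(sorted_text)   # ['1', '10', '2']
```

The same three column names come back in a different order depending on the comparator, which matters when column names are numeric, as in the StateCity example above.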

3.3. Column

3.3.1. Intro

A Column consists of a name, value and a timestamp.

Model representation:

    Column
    name         Binary
    value        Binary
    timestamp    i64

Data representation:

    Column
    name           value       timestamp
    "firstname"    "Ronald"    1270073054
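The same triple can be written down as a small Python type. This is a minimal sketch of the model above, not a class from any Cassandra client library; the class name `Column` and the range check are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Column:
    name: bytes      # Binary
    value: bytes     # Binary
    timestamp: int   # i64

    def __post_init__(self):
        # An i64 timestamp must fit in a signed 64-bit integer.
        if not -(2 ** 63) <= self.timestamp < 2 ** 63:
            raise ValueError("timestamp must fit in a signed 64-bit integer")


col = Column(name=b"firstname", value=b"Ronald", timestamp=1270073054)
print(col.name.decode("utf-8"), col.value.decode("utf-8"), col.timestamp)
```

Both the name and the value are raw bytes; the quoted strings in the data representation are just those bytes shown as text.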

3.3.2. CLI

N/A

3.4. SuperColumn

3.4.1. Intro

A SuperColumn is very similar to a ColumnFamily: it consists of a key and a list of Columns.

Model representation:

    SuperColumn
    key       list
    binary    1 .. * Columns

Data representation:

    SuperColumn
    key    Columns
           name           value           timestamp
    1      "firstname"    "Ronald"        1270073054
           "lastname"     "Mathies"       1270073054
           "birthday"     "01/01/1978"    1270073054
    2      "firstname"    "John"          1270084021
           "lastname"     "Steward"       1270084021
           "birthday"     "01/01/1982"    1270084021

The only difference from a ColumnFamily is how it is used: a SuperColumn lives inside a ColumnFamily, so it adds an extra layer to the data structure. Instead of a row that consists of a key and a list of columns, we can now have a row that consists of a key and a list of super columns, each of which has its own key and a list of columns.

When a ColumnFamily uses SuperColumns, its column type must be “Super”. The default column type is “Standard”, which means the ColumnFamily holds ordinary Columns.
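The extra layer is easiest to see side by side. The sketch below models a "Standard" and a "Super" column family as nested Python dictionaries; it is an in-memory illustration only, not a Cassandra client API. The row key `people` and the helper `lookup` are hypothetical names.

```python
# Illustrative in-memory model only; not a Cassandra client API.

# Column type "Standard": row key -> column name -> (value, timestamp)
standard_cf = {
    "1": {
        "firstname": ("Ronald", 1270073054),
        "lastname": ("Mathies", 1270073054),
    },
}

# Column type "Super": row key -> super column key -> column name -> (value, timestamp)
super_cf = {
    "people": {  # hypothetical row key
        "1": {
            "firstname": ("Ronald", 1270073054),
            "lastname": ("Mathies", 1270073054),
        },
        "2": {
            "firstname": ("John", 1270084021),
            "lastname": ("Steward", 1270084021),
        },
    },
}


def lookup(column_family, *path):
    """Follow a path of keys down the nested structure."""
    node = column_family
    for key in path:
        node = node[key]
    return node


print(lookup(standard_cf, "1", "firstname"))         # two keys reach a column
print(lookup(super_cf, "people", "2", "lastname"))   # three keys reach a column
```

Reaching a column takes one more key in the "Super" case, which is exactly the extra layer described above.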

3.4.2. CLI (TODO)

To be added here!

3.5. Others?

Please refer to http://wiki.apache.org/cassandra/API for more.

 
