http://wiki.apache.org/cassandra/%E9%A6%96%E9%A1%B5
Cassandra documentation from DataStax
DataStax's latest Cassandra documentation covers topics from installation to troubleshooting. Documentation for older releases is also available.
Introduction
This document provides a few easy-to-follow steps that take a first-time user from installation, through running a single-node Cassandra, to an overview of configuring a multinode cluster. Cassandra is meant to run on a cluster of nodes, but it runs equally well on a single machine, which is a handy way to get familiar with the software while avoiding the complexities of a larger system.
Step 0: Prerequisites and connection to the community
Cassandra requires the most stable version of Java 1.6 you can deploy. For Sun's JVM, this means at least u19; u21 is better. Cassandra also runs on the IBM JVM, and should run on JRockit as well.
The best way to ensure you always have up to date information on the project, releases, stability, bugs, and features is to subscribe to the users mailing list (subscription required) and participate in the #cassandra channel on IRC.
Step 1: Download Cassandra Kit
- Download links for the latest stable release can always be found on the website.
- Users of Debian or Debian-based derivatives can install the latest stable release in package form; see DebianPackaging for details.
- Users of RPM-based distributions can get packages from DataStax.
- If you are interested in building Cassandra from source, please refer to the How to Build page. For more details about miscellaneous builds, please refer to the Cassandra versions and builds page.
- If you plan to run the "snapshot" command on Cassandra, you should also install jna.jar. Please refer to the Backup Data section.
Step 2: Edit configuration files
Cassandra configuration files can be found in the conf directory under the top directory of the binary and source distributions. If you have installed Cassandra from RPM packages, the configuration files are placed in /etc/cassandra/conf.
Step 2.1: Edit cassandra.yaml
The distribution's sample configuration conf/cassandra.yaml contains reasonable defaults for single node operation, but you will need to make sure that the paths exist for data_file_directories, commitlog_directory, and saved_caches_directory.
Verify that storage_port and rpc_port do not conflict with other services on your computer. By default, Cassandra uses 7000 for storage_port and 9160 for rpc_port. The storage_port must be identical on all Cassandra nodes in a cluster. Cassandra client applications use rpc_port to connect to Cassandra.
It is a good idea to change cluster_name to avoid unnecessary conflicts with existing clusters.
You can leave initial_token blank, but setting it to 0 is recommended if you are configuring your first node.
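The checks described in this step can be sketched in a few lines of Python. This is only an illustration, not part of Cassandra: the directory paths below are assumed example values, and the helper names `missing_directories` and `port_is_free` are hypothetical. Substitute the paths and ports from your own cassandra.yaml.

```python
import os
import socket

# Assumed example values; read the real ones from your cassandra.yaml.
DIRECTORIES = [
    "/var/lib/cassandra/data",          # data_file_directories
    "/var/lib/cassandra/commitlog",     # commitlog_directory
    "/var/lib/cassandra/saved_caches",  # saved_caches_directory
]
PORTS = {"storage_port": 7000, "rpc_port": 9160}


def missing_directories(paths):
    """Return the paths that do not exist as directories."""
    return [p for p in paths if not os.path.isdir(p)]


def port_is_free(port, host="127.0.0.1"):
    """Return True if nothing is bound to host:port right now."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        try:
            sock.bind((host, port))
        except OSError:
            return False
    return True


if __name__ == "__main__":
    for path in missing_directories(DIRECTORIES):
        print("missing directory:", path)
    for name, port in PORTS.items():
        state = "free" if port_is_free(port) else "already in use"
        print(f"{name} {port}: {state}")
```

Run it before the first start of a node; once Cassandra is running, the two ports are expected to be reported as already in use.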
Step 2.2: Edit log4j-server.properties
conf/log4j-server.properties contains the path of the log file. Edit this line if you need to change it.
    # Edit the next line to point to your logs directory
    log4j.appender.R.File=/var/log/cassandra/system.log
Step 2.3: Edit cassandra-env.sh
Cassandra has a JMX (Java Management Extensions) interface, and the JMX_PORT is defined in conf/cassandra-env.sh. Edit the following line if you need to.
    # Specifies the default port over which Cassandra will be available for
    # JMX connections.
    JMX_PORT="7199"
By default, Cassandra allocates memory based on the physical memory your system has. For example, it allocates a 1GB heap on a 2GB system and a 2GB heap on an 8GB system. To specify the Cassandra heap size yourself, remove the leading pound sign (#) from the following lines and set the memory sizes.
    #MAX_HEAP_SIZE="4G"
    #HEAP_NEWSIZE="800M"
If you are not familiar with Java GC, 1/4 of MAX_HEAP_SIZE may be a good starting point for HEAP_NEWSIZE.
Cassandra needs more than a few GB of heap for production use, but you can run it with a smaller footprint for a test drive. To assign 128MB as the maximum, edit the lines as follows.
    MAX_HEAP_SIZE="128M"
    HEAP_NEWSIZE="32M"
If you face OutOfMemory exceptions or massive GCs with this configuration, increase these values. Don't start your production service with such a tiny heap configuration!
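The heap arithmetic above can be checked with a small Python sketch. The default rule used here, max(min(1/2 RAM, 1024MB), min(1/4 RAM, 8192MB)), is an assumption chosen because it reproduces the two examples in the text (1GB on a 2GB system, 2GB on an 8GB system); check your own conf/cassandra-env.sh for the rule your version actually applies. The function names are hypothetical.

```python
def default_max_heap_mb(system_memory_mb):
    """Assumed rule: max(min(1/2 RAM, 1024 MB), min(1/4 RAM, 8192 MB))."""
    half = system_memory_mb // 2
    quarter = system_memory_mb // 4
    return max(min(half, 1024), min(quarter, 8192))


def suggested_new_size_mb(max_heap_mb):
    """Starting point from the text: one quarter of MAX_HEAP_SIZE."""
    return max_heap_mb // 4


for ram_mb in (2048, 8192):
    heap = default_max_heap_mb(ram_mb)
    new_size = suggested_new_size_mb(heap)
    print(f"{ram_mb} MB RAM -> MAX_HEAP_SIZE={heap}M, HEAP_NEWSIZE={new_size}M")
```

The same quarter rule gives the 32M HEAP_NEWSIZE used with the 128M test-drive heap.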
- Note for Mac users:
Some people running OS X have trouble getting Java 6 to work. If you've kept up with Apple's updates, Java 6 should already be installed (it comes in Mac OS X 10.5 Update 1). Unfortunately, Apple does not default to using it. What you have to do is change your JAVA_HOME environment setting to /System/Library/Frameworks/JavaVM.framework/Versions/1.6/Home and add /System/Library/Frameworks/JavaVM.framework/Versions/1.6/Home/bin to the beginning of your PATH.
Step 3: Start up Cassandra
And now for the moment of truth: start up Cassandra by invoking bin/cassandra -f from the command line (see the footnote at the end of Step 4 for recent Linux distributions and Mac OS X Snow Leopard). The service should start in the foreground and log gratuitously to standard output. Assuming you don't see messages with scary words like "error" or "fatal", or anything that looks like a Java stack trace, chances are you've succeeded.
Press "Control-C" to stop Cassandra.
If you start Cassandra without the "-f" option, it runs in the background, so you need to kill the process to stop it.
Step 4: Using cassandra-cli
bin/cassandra-cli is an interactive command line interface for Cassandra. You can define a schema, and store and fetch data with the tool. Run the following command to connect to your Cassandra instance.
    bin/cassandra-cli -h host -p rpc_port
example:
    % bin/cassandra-cli -h 127.0.0.1 -p 9160
Then you will see the following cassandra-cli prompt.
    Connected to: "Test Cluster" on 127.0.0.1/9160
    Welcome to Cassandra CLI version 1.0.7

    Type 'help;' or '?' for help.
    Type 'quit;' or 'exit;' to quit.

    [default@unknown]
You can access the online help with the 'help;' command. You need a semicolon (;) at the end to complete a command in the CLI.
    [default@unknown] help;
First, create a keyspace for your test.
    [default@unknown] create keyspace DEMO;
    f53dff10-5bd8-11e1-0000-915a024292eb
    Waiting for schema agreement...
    ... schemas agree across the cluster
    [default@unknown]
Don't forget to add a semicolon (;) at the end of the command.
Second, authenticate to the DEMO keyspace.
    [default@unknown] use DEMO;
    Authenticated to keyspace: DEMO
    [default@DEMO]
Third, create a column family named Users, just for testing.
    [default@DEMO] create column family Users;
    18a3e2d0-5bd9-11e1-0000-915a024292eb
    Waiting for schema agreement...
    ... schemas agree across the cluster
    [default@DEMO]
Now you can store data in the Users column family.
    [default@DEMO] set Users[utf8('1234')][utf8('name')] = utf8('scott');
    Value inserted.
    Elapsed time: 10 msec(s).
    [default@DEMO] set Users[utf8('1234')][utf8('password')] = utf8('tiger');
    Value inserted.
    Elapsed time: 10 msec(s).
    [default@DEMO]
You have inserted a row into the Users column family. The row key is '1234', and the row has two columns, named 'name' and 'password'. 'utf8()' means the data is treated as a UTF-8 string. Refer to 'help set;' for more details. Now let's fetch the data you inserted.
    [default@DEMO] get Users[utf8('1234')];
    => (column=6e616d65, value=73636f7474, timestamp=1330051295937000)
    => (column=70617373776f7264, value=7469676572, timestamp=1330051308368000)
    Returned 2 results.
    Elapsed time: 9 msec(s).
    [default@DEMO]
You may notice that the column names and values are not displayed as strings. Use the 'assume' command to tell the CLI the data type of the key, column name and value.
    [default@DEMO] assume Users keys as utf8;
    Assumption for column family 'Users' added successfully.
    [default@DEMO] assume Users comparator as utf8;
    Assumption for column family 'Users' added successfully.
    [default@DEMO] assume Users validator as utf8;
    Assumption for column family 'Users' added successfully.
    [default@DEMO] get Users['1234'];
    => (column=name, value=scott, timestamp=1330051295937000)
    => (column=password, value=tiger, timestamp=1330051308368000)
    Returned 2 results.
    Elapsed time: 9 msec(s).
    [default@DEMO]
Please note that we didn't use "utf8()" for the row key this time. You can also define the data types as metadata of the column family. Check 'help update column family;' and 'help create column family;' for more details.
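The hexadecimal strings shown by the first 'get' (before any 'assume' command) are simply the UTF-8 bytes of the stored names and values. A short Python sketch, independent of Cassandra, decodes them; the helper name `decode_utf8_hex` is hypothetical.

```python
def decode_utf8_hex(hex_string):
    """Turn a hex dump such as '6e616d65' back into the UTF-8 text it encodes."""
    return bytes.fromhex(hex_string).decode("utf-8")


# Column names and values copied from the raw `get` output in this step.
row = {
    "6e616d65": "73636f7474",
    "70617373776f7264": "7469676572",
}

for column, value in row.items():
    print(decode_utf8_hex(column), "=", decode_utf8_hex(value))
# prints:
# name = scott
# password = tiger
```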
- Note: You can't update the comparator (the validation class for column names) after creating a column family. Please refer to CASSANDRA-2809.
To be certain though, take some time to try out the examples in CassandraCli before moving on. Also, if you run into problems, Don't Panic; calmly proceed to If Something Goes Wrong.
Footnote to Step 3: Users of recent Linux distributions and Mac OS X Snow Leopard should be able to start up Cassandra simply by untarring and invoking bin/cassandra -f with root privileges. Snow Leopard ships with Java 1.6.0 and does not require changing the JAVA_HOME environment variable or adding any directory to your PATH. On Linux just make sure you have a working Java JDK package installed such as the openjdk-6-jdk on Ubuntu Lucid Lynx.
Configuring Multinode Cluster
Now you have a single working Cassandra node. It is a Cassandra cluster that has only one node. By adding more nodes, you can make it a multinode cluster.
Setting up a Cassandra cluster is almost as simple as repeating the above procedures for each node in your cluster. There are a few minor exceptions though.
Cassandra nodes exchange information about one another using a mechanism called Gossip, but to get the ball rolling a newly started node needs to know of at least one other node; this is called a Seed. It's customary to pick a small number of relatively stable nodes to serve as your seeds, but there is no hard-and-fast rule here. Do make sure that each seed also knows of at least one other. Remember, the goal is to avoid a chicken-and-egg scenario and provide an avenue for all nodes in the cluster to discover one another.
In addition to seeds, you'll also need to configure the IP interfaces to listen on for Gossip and Thrift (listen_address and rpc_address respectively). Use a listen_address that will be reachable from the listen_address used on all other nodes, and an rpc_address that will be accessible to clients.
One other thing you need to take care of in a multinode cluster is the token. Each node in the cluster owns a part of the token range from 0 to 2^127-1. If the Nth node in the cluster has token value T(N), the node owns the range from T(N-1)+1 to T(N). Cassandra decides which nodes should store a piece of data based on a consistent mapping of the row key to the token range (refer to RandomPartitioner, ByteOrderedPartitioner).
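The range rule above can be illustrated with a small Python sketch. It is a simplification, not Cassandra's code: `token_for_key` is a hypothetical stand-in that hashes the key with MD5 and reduces it into the token range (the real RandomPartitioner may derive the token differently), and `owner_index` only applies the "T(N-1)+1 to T(N)" rule with wrap-around to the first node.

```python
import bisect
import hashlib

RING_SIZE = 2**127  # tokens run from 0 to 2**127 - 1


def token_for_key(row_key):
    """Simplified stand-in for a hash partitioner: MD5 of the key, reduced to the token range."""
    digest = hashlib.md5(row_key.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % RING_SIZE


def owner_index(token, node_tokens):
    """Index of the node owning `token`, given the nodes' tokens in ascending order.

    Node N owns T(N-1)+1 .. T(N); tokens above the largest T wrap to the first node.
    """
    index = bisect.bisect_left(node_tokens, token)
    return index % len(node_tokens)


# Four evenly spaced tokens, as in a balanced four-node cluster.
node_tokens = [RING_SIZE // 4 * n for n in range(4)]
for key in ("1234", "scott", "tiger"):
    token = token_for_key(key)
    print(key, "-> node", owner_index(token, node_tokens))
```

Because the lookup depends only on the sorted token list, every node that knows the ring can compute the same owner for a given key.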
A token can be assigned to a node with the initial_token parameter in cassandra.yaml. The parameter is effective only at the first boot of the node. Once a node has booted, use the 'nodetool move' command to change its assigned token. You need to specify an appropriate initial_token for each node to balance the data load across the nodes. Here is a Python script that calculates balanced tokens.
    # Number of nodes in the cluster
    num_node = 4

    for n in range(num_node):
        # Integer division keeps the large token values exact
        print(2**127 // num_node * n)
Once everything is configured and the nodes are running, use the bin/nodetool ring utility to verify a properly connected cluster. For example:
    eevans@achilles:~$ bin/nodetool -host 192.168.0.10 -p 7199 ring
    Address         DC    Rack  Status  State   Load      Owns    Token
                                                                  127605887595351923798765477786913079296
    192.168.0.10    DC1   r1    Up      Normal  17.3 MB   25.00%  0
    192.168.0.11    DC1   r1    Up      Normal  17.4 MB   25.00%  42535295865117307932921825928971026432
    192.168.0.12    DC1   r1    Up      Normal  37.2 MB   25.00%  85070591730234615865843651857942052864
    192.168.0.13    DC1   r1    Up      Normal  24.55 MB  25.00%  127605887595351923798765477786913079296
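The "Owns" column follows from the tokens alone. The sketch below is an illustration, not nodetool's implementation, and the function name `ownership_percent` is hypothetical. It computes each node's share of the 0 to 2^127-1 range from the span between its token and the previous node's token.

```python
from fractions import Fraction

RING_SIZE = 2**127


def ownership_percent(node_tokens):
    """Percent of the token range owned by each node, in the order given.

    Each node owns the span from the previous node's token (exclusive)
    up to its own token (inclusive), wrapping around the ring.
    Tokens are assumed to be distinct.
    """
    tokens = sorted(node_tokens)
    shares = {}
    for i, token in enumerate(tokens):
        previous = tokens[i - 1]  # i == 0 wraps to the last token
        span = (token - previous) % RING_SIZE or RING_SIZE
        shares[token] = float(Fraction(span, RING_SIZE) * 100)
    return [shares[t] for t in node_tokens]


# Four evenly spaced tokens, matching the balanced ring in the example.
tokens = [RING_SIZE // 4 * n for n in range(4)]
print(ownership_percent(tokens))  # [25.0, 25.0, 25.0, 25.0]
```

Uneven percentages from this calculation suggest that the initial_token values were not spaced evenly.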
Advanced cluster management is described in Operations.
If you don't yet have access to hardware for a Cassandra cluster you can try it out on EC2 with CloudConfig.
For more details about configuring a multinode cluster, please refer to MultinodeCluster.
Write your application
The recommended way to communicate with Cassandra in your application is to use a higher-level client. These provide language-specific APIs for talking to Cassandra in a variety of programming languages. The details vary depending on the language and client, but in general using a higher-level client means that you write less code and get several features for free that you would otherwise have to write yourself.
That said, it is useful to know that Cassandra uses Thrift for its external client-facing API. Cassandra's main API/RPC/Thrift port is 9160. Thrift supports a wide variety of languages, so you can code your application to use Thrift directly if you so choose (but again we recommend a high-level client where available).
Important note: If you intend to use Thrift directly, you need to install a version of Thrift that matches the revision that your version of Cassandra uses. See InstallThrift.
Cassandra's main API/RPC/Thrift port is 9160 by default, which is defined as rpc_port in cassandra.yaml. It is a common mistake for API clients to connect to the JMX port instead.
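A client application can guard against the port mix-up before it even opens a connection. The following Python sketch is only an illustration: the helper names `check_client_port` and `can_connect` are hypothetical, the port numbers are the defaults named in this guide, and a successful TCP connection does not prove that the Thrift API is answering on that port.

```python
import socket

RPC_PORT = 9160  # rpc_port default named in the text
JMX_PORT = 7199  # JMX_PORT default from conf/cassandra-env.sh


def check_client_port(port, jmx_port=JMX_PORT):
    """Raise ValueError when a client is about to be pointed at the JMX port."""
    if port == jmx_port:
        raise ValueError(f"port {port} is the JMX port; clients should use rpc_port")
    return port


def can_connect(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    port = check_client_port(RPC_PORT)
    print("rpc_port reachable:", can_connect("127.0.0.1", port))
```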
Checking out a demo application like Twissandra (Python + Django) will also be useful.
If Something Goes Wrong
If you followed the steps in this guide and failed to get up and running, we'd love to help. Here's what we need.
- If you are running anything other than a stable release, please upgrade first and see if you can still reproduce the problem.
- Make sure debug logging is enabled (hint: conf/log4j-server.properties) and save a copy of the output.
- Search the mailing list archive and see if anyone has reported a similar problem and what, if any, resolution they received.
- Ditto for the bug tracking system.
- See if you can put together a unit test, script, or application that reproduces the problem.
Finally, post a message with all relevant details to the list (subscription required), or hop onto IRC (network irc.freenode.net, channel #cassandra) and let us know.