Trying Out Cassandra: Its Write Performance Is Very Poor

pangyi

Blog category:

Technical study

Cassandra Apache MongoDB NoSQL MySQL

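Cassandra is an open-source distributed database contributed by Facebook. It follows the NoSQL philosophy and combines ideas from Dynamo and BigTable. Recently both Twitter and Digg have been migrating their databases from MySQL to Cassandra. Seeing that it is gaining momentum, I downloaded it and ran a test.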

Test environment:

Cassandra is deployed on two machines. The key settings are shown below.
The configuration file is located at %Cassandra_Home%\conf\storage-conf.xml

<Storage>
<!-- ClusterName must be identical on both machines; it identifies the cluster -->
    <ClusterName>BurceServers</ClusterName>
  <AutoBootstrap>false</AutoBootstrap>

    <Keyspaces>
    <Keyspace Name="Keyspace1">
            <KeysCachedFraction>0.01</KeysCachedFraction>
            <ColumnFamily CompareWith="BytesType" Name="Standard1"/>
      <ColumnFamily CompareWith="UTF8Type" Name="Standard2"/>
      <ColumnFamily CompareWith="TimeUUIDType" Name="StandardByUUID1"/>
      <ColumnFamily ColumnType="Super"
                    CompareWith="UTF8Type"
                    CompareSubcolumnsWith="UTF8Type"
                    Name="Super1"
                    Comment="A column family with supercolumns, whose column and subcolumn names are UTF8 strings"/>
    </Keyspace>
  </Keyspaces>

    <Partitioner>org.apache.cassandra.dht.RandomPartitioner</Partitioner>

    <InitialToken></InitialToken>

  <EndPointSnitch>org.apache.cassandra.locator.EndPointSnitch</EndPointSnitch>

    <ReplicaPlacementStrategy>org.apache.cassandra.locator.RackUnawareStrategy</ReplicaPlacementStrategy>

    <ReplicationFactor>1</ReplicationFactor>

  <CommitLogDirectory>c:/cassandra/lib/cassandra/commitlog</CommitLogDirectory>
  <DataFileDirectories>
      <DataFileDirectory>c:/cassandra/lib/cassandra/data</DataFileDirectory>
  </DataFileDirectories>
  <CalloutLocation>c:/cassandra/lib/cassandra/callouts</CalloutLocation>
  <StagingFileDirectory>c:/cassandra/lib/cassandra/staging</StagingFileDirectory>

<!-- Multiple Cassandra servers can be listed here -->
    <Seeds>
      <Seed>10.219.101.101</Seed>
      <Seed>10.219.101.121</Seed>
    </Seeds>


    <RpcTimeoutInMillis>5000</RpcTimeoutInMillis>
    <CommitLogRotationThresholdInMB>128</CommitLogRotationThresholdInMB>

<!-- The listen address must be this machine's own IP -->
    <ListenAddress>10.219.101.101</ListenAddress>
   <StoragePort>7000</StoragePort>
    <ControlPort>7001</ControlPort>
<!-- Address on which Cassandra listens for Thrift-based clients -->
    <ThriftAddress>10.219.101.101</ThriftAddress>
    <ThriftPort>9160</ThriftPort>
    <ThriftFramedTransport>false</ThriftFramedTransport>


    <SlicedBufferSizeInKB>64</SlicedBufferSizeInKB>

   <ColumnIndexSizeInKB>64</ColumnIndexSizeInKB>

    <MemtableSizeInMB>64</MemtableSizeInMB>
  
  <MemtableObjectCountInMillions>0.1</MemtableObjectCountInMillions>
    <MemtableFlushAfterMinutes>60</MemtableFlushAfterMinutes>

    <ConcurrentReads>8</ConcurrentReads>
  <ConcurrentWrites>32</ConcurrentWrites>

    <CommitLogSync>periodic</CommitLogSync>
    <CommitLogSyncPeriodInMS>10000</CommitLogSyncPeriodInMS>
  <GCGraceSeconds>864000</GCGraceSeconds>
  <BinaryMemtableSizeInMB>256</BinaryMemtableSizeInMB>

</Storage>

Apart from adding a second Cassandra server, this is essentially the default configuration.

Test code:

package com.tpri.sis.test;

import java.io.UnsupportedEncodingException;

import org.apache.cassandra.service.Cassandra;
import org.apache.cassandra.service.ColumnPath;
import org.apache.cassandra.service.ConsistencyLevel;
import org.apache.cassandra.service.InvalidRequestException;
import org.apache.cassandra.service.TimedOutException;
import org.apache.cassandra.service.UnavailableException;
import org.apache.commons.lang.RandomStringUtils;
import org.apache.thrift.TException;
import org.apache.thrift.protocol.TBinaryProtocol;
import org.apache.thrift.protocol.TProtocol;
import org.apache.thrift.transport.TSocket;
import org.apache.thrift.transport.TTransport;
import org.apache.thrift.transport.TTransportException;

/**
 * @author brucepang
 * 
 */
public class CassandraClientDemo {

	public static void main(String[] args) {
		// Thrift connection to the first node (ThriftAddress/ThriftPort in storage-conf.xml)
		TTransport tr = new TSocket("10.219.101.101", 9160);

		try {
			TProtocol pro = new TBinaryProtocol(tr);
			Cassandra.Client cli = new Cassandra.Client(pro);
			tr.open();

			String keySpace = "Keyspace1";
			// Both columns live in the Standard1 column family (no super column)
			ColumnPath namePath = new ColumnPath("Standard1", null, "name"
					.getBytes("UTF-8"));
			ColumnPath agePath = new ColumnPath("Standard1", null, "age"
					.getBytes("UTF-8"));

			long start = System.currentTimeMillis();
			for (int i = 0; i < 100; i++) {
				String key = String.valueOf(i);
				String name = RandomStringUtils.random(5, "abcdefghefsdf");
				long time = System.currentTimeMillis();
				// Two inserts per row: a random name, and the row index as the age
				cli.insert(keySpace, key, namePath, name.getBytes("UTF-8"),
						time, ConsistencyLevel.ONE);
				cli.insert(keySpace, key, agePath, key.getBytes("UTF-8"), time,
						ConsistencyLevel.ONE);
			}
			// Elapsed time in milliseconds for all 100 rows
			long elapsed = System.currentTimeMillis() - start;
			System.out.println(elapsed);

		} catch (TTransportException e) {
			e.printStackTrace();
		} catch (UnsupportedEncodingException e) {
			e.printStackTrace();
		} catch (InvalidRequestException e) {
			e.printStackTrace();
		} catch (UnavailableException e) {
			e.printStackTrace();
		} catch (TimedOutException e) {
			e.printStackTrace();
		} catch (TException e) {
			e.printStackTrace();
		} finally {
			// Release the socket even if an insert fails
			tr.close();
		}

	}
}

Test result: writing 100 rows took 59922 ms, nearly 1 minute.
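
For scale, 59922 ms over 100 rows at two inserts each is roughly 300 ms per insert. A total alone does not show whether every call is slow or a few calls dominate. The following is a minimal sketch of one way to check, assuming the write is wrapped in a `Callable`. The `LatencyProbe` class and its method names are hypothetical and are not part of Cassandra or Thrift.

```java
import java.util.Arrays;
import java.util.concurrent.Callable;

// Hypothetical helper: times each call separately instead of only the total.
class LatencyProbe {

    // Runs op `calls` times and returns each call's latency in nanoseconds.
    static long[] measure(Callable<?> op, int calls) throws Exception {
        long[] nanos = new long[calls];
        for (int i = 0; i < calls; i++) {
            long start = System.nanoTime();
            op.call();
            nanos[i] = System.nanoTime() - start;
        }
        return nanos;
    }

    // Nearest-rank percentile (p in (0, 100]) of the samples.
    static long percentile(long[] samples, double p) {
        long[] sorted = samples.clone();
        Arrays.sort(sorted);
        int rank = (int) Math.ceil(p / 100.0 * sorted.length);
        return sorted[Math.max(0, rank - 1)];
    }

    public static void main(String[] args) throws Exception {
        // Stand-in workload; replace the lambda body with the real insert call.
        long[] nanos = measure(() -> { Thread.sleep(1); return null; }, 50);
        System.out.println("median ms: " + percentile(nanos, 50) / 1_000_000.0);
        System.out.println("max ms:    " + percentile(nanos, 100) / 1_000_000.0);
    }
}
```

If the median is small and the maximum is large, a handful of slow calls account for most of the time; if both are near 300 ms, every call is slow.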

2010-03-16 17:24

#9 悬空90 2013-05-30

Cassandra 1.2.5

2-node Linux cluster

1,000,000 rows

10 threads

Elapsed time: 80 seconds

Is there still room for optimization?
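
The figures in this comment work out to 1,000,000 / 80 = 12,500 rows per second across 10 threads. The commenter does not show the client, so the following is only a sketch of how such a multi-threaded load is commonly structured: the key range is split into contiguous slices, one per worker. `RowWriter`, `ParallelLoad` and `runLoad` are hypothetical names. In a real test each worker would open its own Thrift connection, since a Thrift client object is not safe to share between threads.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical callback: writes the row with the given numeric key.
interface RowWriter {
    void write(long key) throws Exception;
}

class ParallelLoad {

    // Splits keys [0, rows) into one contiguous slice per thread and
    // returns the elapsed wall-clock time in milliseconds.
    static long runLoad(long rows, int threads, RowWriter writer) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        List<Future<?>> pending = new ArrayList<>();
        long start = System.nanoTime();
        try {
            for (int t = 0; t < threads; t++) {
                long from = rows * t / threads;
                long to = rows * (t + 1) / threads;
                pending.add(pool.submit(() -> {
                    for (long key = from; key < to; key++) {
                        writer.write(key);
                    }
                    return null;
                }));
            }
            for (Future<?> f : pending) {
                f.get(); // rethrows any worker failure as ExecutionException
            }
        } finally {
            pool.shutdown();
        }
        return (System.nanoTime() - start) / 1_000_000;
    }

    public static void main(String[] args) throws Exception {
        AtomicLong written = new AtomicLong();
        // Stand-in writer that only counts; a real one would call insert().
        long ms = runLoad(1_000_000, 10, key -> written.incrementAndGet());
        System.out.println(written.get() + " rows in " + ms + " ms");
    }
}
```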

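#8 eyesmore 2012-09-28

I have not used this version. But in principle, with your configuration:
<CommitLogSync>periodic</CommitLogSync>
<CommitLogSyncPeriodInMS>10000</CommitLogSyncPeriodInMS>
the CommitLog is persisted asynchronously on a timer and the data itself goes into the in-memory Memtable, so this cannot be what hurts write performance. Either your experiment is flawed, or that older version had a problem.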
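
The argument in this comment is that with `periodic` sync a write only appends in memory, while flushing to disk happens on a timer. A toy model of that idea might look like the sketch below. It assumes nothing about Cassandra's actual implementation; the `PeriodicLog` class and all of its members are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

// Toy model only: not Cassandra code.
class PeriodicLog {
    private final List<String> buffer = new ArrayList<>();   // appended, not yet synced
    private final List<String> durable = new ArrayList<>();  // stands in for the disk

    // The write path: append in memory and return immediately.
    synchronized void append(String entry) {
        buffer.add(entry);
    }

    // The sync path: move everything buffered so far to "disk".
    synchronized void sync() {
        durable.addAll(buffer);
        buffer.clear();
    }

    synchronized int unsyncedCount() { return buffer.size(); }
    synchronized int durableCount() { return durable.size(); }

    public static void main(String[] args) throws InterruptedException {
        PeriodicLog log = new PeriodicLog();
        ScheduledExecutorService timer = Executors.newSingleThreadScheduledExecutor();
        // 100 ms here for a quick demo; the post's configuration uses 10000 ms.
        timer.scheduleAtFixedRate(log::sync, 100, 100, TimeUnit.MILLISECONDS);
        for (int i = 0; i < 1000; i++) {
            log.append("row-" + i); // never waits for the timer
        }
        Thread.sleep(300);
        timer.shutdown();
        System.out.println("durable: " + log.durableCount());
    }
}
```

In this model the cost of a write does not depend on the sync period. The trade-off is that entries appended since the last sync are lost if the process dies before the next one.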

#7 linliangyi2007 2010-07-10

Bad luck for the original poster: you have crippled Cassandra. Why are there two seeds in your configuration? For a cluster, one seed is enough.

#6 machoche 2010-06-07

Some of the libraries cannot be found?

#5 pangyi 2010-05-17

I recently tried MongoDB; both its read and write performance are good.

#4 zdyhlp 2010-04-19

I tried it on a single node on a Windows machine. It wrote 10,000 records in 4 seconds.

TTransport tr = new TSocket("localhost", 9160);

TProtocol pro = new TBinaryProtocol(tr);
Cassandra.Client cli = new Cassandra.Client(pro);
tr.open();

String key = null;
String name, age;
ColumnPath namePath = new ColumnPath("Standard1");
namePath.setColumn("name".getBytes("UTF-8"));
ColumnPath agePath = new ColumnPath("Standard1");
agePath.setColumn("age".getBytes("UTF-8"));

#3 waterdh 2010-04-02

It may be a problem with the machines.
My two-machine cluster tested quite well: inserting 20,000 rows from a single thread took 13 s.

#2 pangyi 2010-03-22

Single-machine performance is acceptable. But Cassandra's main feature is its distributed nature.

#1 wing5jface 2010-03-19

On a single machine, your program took only 94 ms when I ran it.
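
The numbers reported in the post and in the comments use different row counts and units, which makes them hard to compare at a glance. Converting each to rows per second is simple arithmetic; the sketch below does only that, using the figures exactly as quoted. The class and method names are hypothetical, and the runs differ in hardware, Cassandra version and thread count, so this is not a benchmark.

```java
class Throughput {

    // Rows written per second, given a row count and elapsed milliseconds.
    static double rowsPerSecond(long rows, long elapsedMs) {
        return rows * 1000.0 / elapsedMs;
    }

    public static void main(String[] args) {
        String[] labels = {
            "post, 2-node cluster",
            "comment 1, single machine",
            "comment 3, 2-node cluster",
            "comment 4, single node",
            "comment 9, 2 nodes, 10 threads"
        };
        long[] rows = {100, 100, 20_000, 10_000, 1_000_000};
        long[] ms   = {59_922, 94, 13_000, 4_000, 80_000};
        for (int i = 0; i < labels.length; i++) {
            System.out.printf("%-32s %10.1f rows/s%n",
                    labels[i], rowsPerSecond(rows[i], ms[i]));
        }
    }
}
```

By this arithmetic the original run is about 1.7 rows/s, while the reports in the comments range from roughly 1,000 to 12,500 rows/s.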
