Why is my Cassandra throughput not improving when I add nodes?

Views: 146 | Published: 2016/11/13 14:05:11 | Tag: cassandra

Question

This is a newbie question. I have tried to do my homework, but I am stuck trying to see how Cassandra scales linearly, as advertised. When I run against a single Cassandra node, I get reasonable insert rates. Here are some relevant bits of information:

CentOS 6.5
java 1.7.0_71
cassandra 2.1.4 binary download
data and commitlog on different drives
compaction_throughput_mb_per_sec: 0
10,000,000 inserts
Insertion rate: ~110K inserts/s
I have not implemented these settings yet, since I am less interested in making things blazing fast than in observing linear scaling.

My keyspace is defined as follows:

create keyspace nms WITH REPLICATION = { 'class' : 'NetworkTopologyStrategy', 'datacenter1' : 1 };
use nms;
CREATE TABLE RN(tableId int, sampleTime timestamp, sampleValue bigint, sampleStdev bigint, sampleRate bigint, tz_offset int,
       PRIMARY KEY (tableId, sampleTime));

My relevant Java code looks like this (roughly):

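import com.datastax.driver.core.*;
import com.datastax.driver.core.policies.*;

// Connect through a single contact point; the driver discovers the remaining nodes.
Cluster cluster = Cluster.builder().addContactPoint("138.42.229.240")
                .withQueryOptions(new QueryOptions().setConsistencyLevel(ConsistencyLevel.ANY))
                .withRetryPolicy(DefaultRetryPolicy.INSTANCE)
                .withLoadBalancingPolicy(new TokenAwarePolicy(new RoundRobinPolicy()))
                .build();
Session session = cluster.connect("nms");
BatchStatement batch = new BatchStatement();
// Prepared once, then bound for every row that is inserted.
PreparedStatement statement = session.prepare("INSERT INTO RN " +
            "(tableId, sampleTime, sampleValue, sampleStdev, sampleRate, tz_offset) " +
            "VALUES (?, ?, ?, ?, ?, ?);");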

I am inserting 32 tableIds (partition key), each "owned" by a single thread, and unique sampleTimes. The other data is filler junk.

I found the sweet spot to be ~10 inserts per batch and 10 executeAsync() call groups.
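To make the load pattern concrete, here is a minimal, self-contained sketch of how I read the description: one thread per tableId, rows grouped into batches of 10, and up to 10 asynchronous batch calls issued before waiting for them. It does not use the Cassandra driver at all. `fakeExecuteAsync`, `InsertLoadSketch`, and the constants are hypothetical stand-ins for illustration, and the "wait after each group of 10" behavior is an assumption about what "call groups" means, not the original test code.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;

final class InsertLoadSketch {
    static final int ROWS_PER_BATCH = 10;    // "~10 inserts per batch"
    static final int BATCHES_IN_FLIGHT = 10; // assumed meaning of "10 executeAsync() call groups"

    // Hypothetical stand-in for session.executeAsync(batch): it only counts rows.
    static CompletableFuture<Void> fakeExecuteAsync(List<long[]> batch, AtomicLong written) {
        return CompletableFuture.runAsync(() -> written.addAndGet(batch.size()));
    }

    // One writer thread: owns a single tableId and generates unique sampleTimes.
    static void writePartition(int tableId, int rows, AtomicLong written) {
        List<CompletableFuture<Void>> inFlight = new ArrayList<>();
        List<long[]> batch = new ArrayList<>();
        for (int i = 0; i < rows; i++) {
            batch.add(new long[] {tableId, i}); // (tableId, sampleTime); filler columns omitted
            boolean last = (i == rows - 1);
            if (batch.size() == ROWS_PER_BATCH || last) {
                inFlight.add(fakeExecuteAsync(batch, written));
                batch = new ArrayList<>();      // hand the old list off, start a fresh one
            }
            if (inFlight.size() == BATCHES_IN_FLIGHT || last) {
                inFlight.forEach(CompletableFuture::join); // wait for the group before sending more
                inFlight.clear();
            }
        }
    }

    // Runs one writer thread per partition and returns the total number of rows sent.
    static long run(int partitions, int rowsPerPartition) {
        AtomicLong written = new AtomicLong();
        ExecutorService pool = Executors.newFixedThreadPool(partitions);
        for (int t = 0; t < partitions; t++) {
            final int tableId = t;
            pool.submit(() -> writePartition(tableId, rowsPerPartition, written));
        }
        pool.shutdown();
        try {
            pool.awaitTermination(1, TimeUnit.MINUTES);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return written.get();
    }

    public static void main(String[] args) {
        System.out.println(run(32, 1000) + " rows sent");
    }
}
```

With this structure, each thread only ever touches one partition key, so at any moment the client has at most 32 x 10 batches outstanding, spread over just 32 partitions.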

So far so good. Next, I added 4 nodes, using scrounged hardware and 3 VMs running on an SSD SAN (not ideal, I know). I used a similar configuration for each node to the one described above and ran my simple test, expecting some improvement. The insertion rate was unchanged, and I cannot explain that. Moreover, the rate remains largely unchanged with 2, 3, 4 and 5 nodes. I realize that odd numbers probably make no sense, but I was desperate.

I then tried setting up the keyspace with a replication factor of zero. My data rates went down to 1K inserts/s. I cannot explain this. I must be missing something really obvious, but I cannot see it.
