How to access the local data of a Cassandra node


Question

From what little I understand of Cassandra, data locality seems to be mostly transparent to the client application that accesses a node, as it should be.

However, what if I explicitly wanted to access only the data of a column family that is local to the node I'm connected to? Is such a thing possible? I haven't found a way of getting this from a client API out of the box. It seems that I could get some of this information through the system tables, but I can't quite figure out how to do it.
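To make the notion of "local" concrete, here is a minimal sketch of how ownership on a token ring can be decided once each node's token is known. It is not a Cassandra or Hector API: the class `TokenRing`, its methods, and the node names are hypothetical, tokens are modeled as `long` values purely for illustration, and the sketch considers only the primary owner (it ignores the replication factor). Computing a key's token is the job of the cluster's partitioner and is not reproduced here.

```java
import java.util.Map;
import java.util.TreeMap;

/** Illustrative model of a token ring; not part of any Cassandra client library. */
final class TokenRing {
    // Node token -> node name. Both are made-up values supplied by the caller.
    private final TreeMap<Long, String> ring = new TreeMap<>();

    void addNode(long token, String node) {
        ring.put(token, node);
    }

    /**
     * A node with token T owns the range (previous token, T], so the primary
     * owner of a key is the node with the smallest token >= the key's token,
     * wrapping around to the first node when no such token exists.
     */
    String primaryOwner(long keyToken) {
        if (ring.isEmpty()) {
            throw new IllegalStateException("ring has no nodes");
        }
        Map.Entry<Long, String> entry = ring.ceilingEntry(keyToken);
        return (entry != null ? entry : ring.firstEntry()).getValue();
    }

    /** True when the given node is the primary owner of the key's token. */
    boolean isLocal(long keyToken, String localNode) {
        return primaryOwner(keyToken).equals(localNode);
    }

    public static void main(String[] args) {
        TokenRing ring = new TokenRing();
        ring.addNode(0L, "node-a");
        ring.addNode(100L, "node-b");
        ring.addNode(200L, "node-c");
        System.out.println(ring.primaryOwner(50L));
        System.out.println(ring.isLocal(250L, "node-a"));
    }
}
```

A client holding such a ring could skip any key for which `isLocal` returns false, provided it obtains the real node tokens and key tokens from the cluster itself.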

The idea is to perform MapReduce, but without using Hadoop. A local client would connect to its local Cassandra node, perform aggregation on the local data, and then pass the result back upstream.
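The shape of that computation can be sketched as follows, assuming each local client has already obtained the row keys held by its node. The class `LocalAggregation`, its method names, and the sample data are hypothetical, and a per-key count stands in for whatever aggregation is actually needed.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Illustrative per-node aggregation followed by an upstream merge. */
final class LocalAggregation {
    /** "Map" step: count occurrences of each key among one node's local rows. */
    static Map<String, Long> aggregateLocal(List<String> localKeys) {
        Map<String, Long> counts = new HashMap<>();
        for (String key : localKeys) {
            counts.merge(key, 1L, Long::sum);
        }
        return counts;
    }

    /** "Reduce" step: sum the partial counts received from every node. */
    static Map<String, Long> mergeUpstream(List<Map<String, Long>> partials) {
        Map<String, Long> total = new HashMap<>();
        for (Map<String, Long> partial : partials) {
            partial.forEach((key, count) -> total.merge(key, count, Long::sum));
        }
        return total;
    }

    public static void main(String[] args) {
        // Made-up sample data standing in for the rows local to two nodes.
        Map<String, Long> nodeA = aggregateLocal(Arrays.asList("x", "y", "x"));
        Map<String, Long> nodeB = aggregateLocal(Arrays.asList("y", "z"));
        System.out.println(mergeUpstream(Arrays.asList(nodeA, nodeB)));
    }
}
```

One caveat with this design: with a replication factor above 1 the same row is stored on several nodes, so each node would need to aggregate only the rows it is the primary owner of, or the merged totals would count replicas more than once.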

Is such a thing possible at all? By the looks of it, it should be, since I've seen evidence of Hadoop being able to use Cassandra, but the examples seem to be geared towards Hadoop rather than a generic client. The local client (the bit talking to Cassandra) would be in Java. I'm currently using Hector, but I'm unsure whether it provides any data locality information.

Recommended answer

A recent article on the Netflix Techblog introduces Aegisthus, a project that reads the SSTables stored on disk across the cluster and merges them into a single, consistent view of the data (in MapReduce). I would imagine the same mechanics could then be used to generate a view of the data on a single node.

Unfortunately, I don't think they've open-sourced this tool yet, so you won't be able to use it. At this point, the most it offers is a glimmer of hope that, yes, it is possible to read SSTables natively using non-Cassandra code.

You may be able to hack something together using the part of the Cassandra source that reads SSTables, and have that feed the local client you're hoping to build. A great starting point would be the source of org.apache.cassandra.tools.SSTableExport, which is used in the sstable2json tool.
