Pull from a Cassandra database whenever there is a new row or an update?


Question

I am working on a system in which I need to store Avro schemas in a Cassandra database. In Cassandra we will be storing something like this:

SchemaId            AvroSchema

1                   some schema
2                   another schema

Now suppose I insert another row into the table above, so that it looks like this:

SchemaId            AvroSchema

1                   some schema
2                   another schema
3                   another new schema

As soon as I insert a new row into the table above, I need to tell my Java program to go and pull the new schema ID and the corresponding schema.

What is the right way to solve this kind of problem?

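I know that one way is to poll every few minutes, say pulling the data from the table above every 5 minutes. But that does not seem like the right way to solve this, because I would be doing a pull every 5 minutes whether or not there are any new schemas.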

Is there any other solution apart from this?

Can we use Apache ZooKeeper? Or is ZooKeeper not a fit for this problem? Or is there any other solution?

I am running Apache Cassandra 1.2.9.

Recommended answer

Some solutions:

  • With database triggers: Cassandra 2.0 has some trigger support, but it does not look final and might change a little in 2.1, according to this article: http://www.datastax.com/dev/blog/whats-new-in-cassandra-2-0-prototype-triggers-support. Triggers are a common solution.
  • You brought up polling, but that is not always a bad option, especially if you have something that marks a row as not yet pulled, so that you can pull just the new rows out of Cassandra. Polling once every 5 minutes is negligible load for Cassandra or any database, as long as the query is not expensive. This option might not be a good fit if new rows are inserted very infrequently.
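The polling option can be sketched in Java roughly as follows. This is only an illustration under stated assumptions: `SchemaSource` is a hypothetical interface standing in for whatever query reads the schema table (it is not part of any Cassandra driver), and instead of a per-row "not pulled yet" marker stored in Cassandra, the sketch keeps the last seen rows in memory and reports only the rows that are new or changed. That swap seems reasonable when the table is small, as a schema table usually is.

```java
import java.util.Map;
import java.util.TreeMap;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.function.Consumer;

// Hypothetical abstraction over the Cassandra read; not a real driver API.
interface SchemaSource {
    // Returns every (schemaId -> avroSchema) row currently in the table.
    // Assumes schema values are never null.
    Map<Integer, String> fetchAll();
}

class SchemaPoller {
    private final SchemaSource source;
    // Last schema text seen for each schema ID.
    private final Map<Integer, String> known = new ConcurrentHashMap<>();

    SchemaPoller(SchemaSource source) {
        this.source = source;
    }

    // One polling pass: returns only rows that are new or whose schema changed.
    synchronized Map<Integer, String> pollOnce() {
        Map<Integer, String> changed = new TreeMap<>();
        for (Map.Entry<Integer, String> row : source.fetchAll().entrySet()) {
            String previous = known.put(row.getKey(), row.getValue());
            if (!row.getValue().equals(previous)) {
                changed.put(row.getKey(), row.getValue());
            }
        }
        return changed;
    }

    // Runs pollOnce() repeatedly and hands non-empty results to the callback.
    ScheduledExecutorService start(long period, TimeUnit unit,
                                   Consumer<Map<Integer, String>> onChange) {
        ScheduledExecutorService exec = Executors.newSingleThreadScheduledExecutor();
        exec.scheduleWithFixedDelay(() -> {
            try {
                Map<Integer, String> changed = pollOnce();
                if (!changed.isEmpty()) {
                    onChange.accept(changed);
                }
            } catch (RuntimeException e) {
                // An uncaught exception would cancel all future runs, so report and continue.
                System.err.println("poll failed: " + e);
            }
        }, 0, period, unit);
        return exec;
    }
}
```

With something like `poller.start(5, TimeUnit.MINUTES, changed -> reload(changed))`, where `reload` is whatever your program does with new schemas, the first pass returns every row and later passes return only what is new or updated. Call `shutdown()` on the returned executor to stop polling.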

ZooKeeper would not be a perfect solution; see this quote:


Because watches are one time triggers and there is latency between getting the event and sending a new request to get a watch you cannot reliably see every change that happens to a node in ZooKeeper. Be prepared to handle the case where the znode changes multiple times between getting the event and setting the watch again. (You may not care, but at least realize it may happen.)

Quote from: http://zookeeper.apache.org/doc/r3.4.2/zookeeperProgrammers.html#sc_WatchRememberThese
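To make the caveat in the quote concrete, here is a small simulation. It deliberately does not use the ZooKeeper client library: `FakeZnode`, `Snapshot`, and `SchemaWatcher` are hypothetical in-memory stand-ins, written only to show how a one-shot watch behaves when several writes land before the watch is set again. The version counter is an assumption of the sketch, used to count how many writes the watcher never saw individually.

```java
import java.util.ArrayList;
import java.util.List;

// Immutable view of the node at one point in time.
class Snapshot {
    final String data;
    final int version;

    Snapshot(String data, int version) {
        this.data = data;
        this.version = version;
    }
}

// Hypothetical in-memory stand-in for a znode; NOT the ZooKeeper client API.
class FakeZnode {
    private String data = "";
    private int version = 0;
    private Runnable watch; // at most one pending one-shot watch

    // Reads the current state and registers a one-shot watch in a single step.
    synchronized Snapshot readAndWatch(Runnable onChange) {
        this.watch = onChange;
        return new Snapshot(data, version);
    }

    void write(String newData) {
        Runnable toFire;
        synchronized (this) {
            data = newData;
            version++;
            toFire = watch;
            watch = null; // one-shot: the watch is gone once it has fired
        }
        if (toFire != null) {
            toFire.run();
        }
    }
}

class SchemaWatcher {
    private final FakeZnode node;
    private volatile boolean dirty;
    private int lastVersion;
    final List<String> observed = new ArrayList<>();
    int missedWrites = 0;

    SchemaWatcher(FakeZnode node) {
        this.node = node;
        this.lastVersion = node.readAndWatch(this::onNotified).version;
    }

    // The notification only says "something changed", not what or how often.
    private void onNotified() {
        dirty = true;
    }

    // Called later by the client; the time before this call models the latency.
    void refresh() {
        if (!dirty) {
            return;
        }
        dirty = false;
        Snapshot s = node.readAndWatch(this::onNotified); // re-arm and re-read
        missedWrites += s.version - lastVersion - 1;       // writes never seen individually
        lastVersion = s.version;
        observed.add(s.data);
    }
}
```

If three writes happen before `refresh()` runs, the watcher ends up with only the latest value and `missedWrites` of 2. For the schema use case this suggests treating a notification as a hint to re-read the full current state from Cassandra, rather than as a reliable record of each individual insert.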
