Pull from Cassandra database whenever there are new rows or updates?


Question

I am working on a system in which I need to store Avro schemas in a Cassandra database. In Cassandra we will be storing something like this:

SchemaId            AvroSchema

1                   some schema
2                   another schema

Now suppose I insert another row into the above table in Cassandra, so that the table looks like this:

SchemaId            AvroSchema

1                   some schema
2                   another schema
3                   another new schema

As soon as I insert a new row into the above table, I need to tell my Java program to go and pull the new schema ID and the corresponding schema.

What is the right way to solve this kind of problem?

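I know one way is to poll every few minutes, say every 5 minutes, and pull the data from the above table. But that does not seem like the right way to solve this problem, because every 5 minutes I would be doing a pull whether or not there are any new schemas.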

Is there any other solution apart from this?

Can we use Apache ZooKeeper? Or is ZooKeeper not a good fit for this problem? Or is there some other solution?

I am running Apache Cassandra 1.2.9.

Recommended answer

Some solutions:

  • With database triggers: Cassandra 2.0 has some trigger support, but it does not look final and might change a little in 2.1 according to this article: http://www.datastax.com/dev/blog/whats-new-in-cassandra-2-0-prototype-triggers-support . Triggers are a common solution.
  • You brought up polling, but that is not always a bad option, especially if you have something that marks a row as not yet pulled, so that you can pull only the new rows out of Cassandra. Polling once every 5 minutes is a negligible load for Cassandra or any database, provided the query is not expensive. Polling may be a poor fit if new rows are inserted only very rarely, since most polls would then find nothing.
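
The polling option could look roughly like the sketch below. It is only an illustration under stated assumptions: SchemaSource and SchemaPoller are hypothetical names, fetchAll() stands in for whatever Cassandra query reads the schema table, and the table is assumed to be small enough to read in full on every poll. Instead of a "not yet pulled" marker stored in Cassandra, this version remembers what it has already seen and reports only new or changed rows.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.function.Consumer;

/** Hypothetical stand-in for the Cassandra read; a real version would query the schema table. */
interface SchemaSource {
    /** Returns every row as schema id -> Avro schema text (values assumed non-null). */
    Map<Integer, String> fetchAll();
}

class SchemaPoller {
    private final SchemaSource source;
    // Rows seen so far; plays the role of the "already pulled" marker.
    private final Map<Integer, String> known = new HashMap<>();

    SchemaPoller(SchemaSource source) {
        this.source = source;
    }

    /** Reads the table once and returns only rows that are new or changed since the last poll. */
    synchronized Map<Integer, String> pollOnce() {
        Map<Integer, String> changed = new HashMap<>();
        for (Map.Entry<Integer, String> row : source.fetchAll().entrySet()) {
            String previous = known.put(row.getKey(), row.getValue());
            if (!row.getValue().equals(previous)) {
                changed.put(row.getKey(), row.getValue());
            }
        }
        return changed;
    }

    /** Polls on one background thread and hands non-empty results to the callback. */
    ScheduledExecutorService start(long intervalMinutes, Consumer<Map<Integer, String>> onChange) {
        ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
        scheduler.scheduleWithFixedDelay(() -> {
            Map<Integer, String> changed = pollOnce();
            if (!changed.isEmpty()) {
                onChange.accept(changed);
            }
        }, 0, intervalMinutes, TimeUnit.MINUTES);
        return scheduler; // caller is responsible for calling shutdown()
    }
}
```

Calling start(5, changed -> ...) would match the 5 minute interval from the question. Note that an uncaught exception inside a scheduled task stops further runs, so a real fetchAll() should handle its own errors.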

ZooKeeper would not be a perfect solution; see this quote:

Because watches are one time triggers and there is latency between getting the event and sending a new request to get a watch you cannot reliably see every change that happens to a node in ZooKeeper. Be prepared to handle the case where the znode changes multiple times between getting the event and setting the watch again. (You may not care, but at least realize it may happen.)

Quoted from: http://zookeeper.apache.org/doc/r3.4.2/zookeeperProgrammers.html#sc_WatchRememberThis
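
To make the quoted caveat concrete, here is a toy model of one-shot watch behaviour. It deliberately does not use the ZooKeeper client API: ToyNode and WatchDemo are hypothetical classes that only imitate the "watch fires once, then must be re-registered" semantics described in the quote.

```java
import java.util.concurrent.atomic.AtomicInteger;

/** Toy stand-in for a znode: NOT the ZooKeeper API, only its one-shot watch semantics. */
class ToyNode {
    private String data;
    private int version = 0;
    private Runnable watch; // at most one pending watch; cleared when it fires

    ToyNode(String initial) {
        this.data = initial;
    }

    /** Reads the data and registers a one-shot watch (pass null for no watch). */
    synchronized String getData(Runnable newWatch) {
        this.watch = newWatch;
        return data;
    }

    synchronized int getVersion() {
        return version;
    }

    /** Writes new data; fires the pending watch, if any, exactly once. */
    void setData(String newData) {
        Runnable toFire;
        synchronized (this) {
            data = newData;
            version++;
            toFire = watch;
            watch = null;
        }
        if (toFire != null) {
            toFire.run();
        }
    }
}

class WatchDemo {
    /** Three writes happen, but the client re-registers its watch only after the third. */
    static String demo() {
        ToyNode node = new ToyNode("v0");
        AtomicInteger notifications = new AtomicInteger();
        node.getData(notifications::incrementAndGet);  // arm the watch
        node.setData("v1");                            // fires the watch, which is now gone
        node.setData("v2");                            // no watch armed: silent
        node.setData("v3");                            // silent
        String seen = node.getData(notifications::incrementAndGet); // re-arm and re-read
        return "notifications=" + notifications.get()
                + ", version=" + node.getVersion() + ", data=" + seen;
    }
}
```

WatchDemo.demo() returns "notifications=1, version=3, data=v3": the client is notified once, never observes v1 or v2, and can only tell from the version counter that it missed intermediate changes. For a table where only the latest state matters this may be acceptable; if every individual change has to be seen, it is not.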
