Alternatives to scalding for HBase access from Scala (or Java)


Question


Could anybody please recommend a good solution (framework) for accessing HBase on a Hadoop cluster from a Scala (or Java) application?

So far I have been moving in the scalding direction. The prototypes I built let me combine the scalding library with Maven and separate the scalding job JAR from the 'library' code packages. This in turn lets me run scalding-based Hadoop jobs from outside the cluster with minimal overhead per job: the 'library' code is pushed to the cluster's distributed cache only when it changes (which is rarely needed), so I can load the job code quickly.

Now I am actually starting to play with HBase itself, and I see that scalding is good but not so 'native' to HBase. Yes, there are things like hbase-scalding, but since I am at a point where I need to plan future work anyway, I'd like to know about other good solutions I have probably missed.

What is expected:

  • The startup overhead of applications (jobs) should be low. I need to run a lot of them.
  • It should be possible (the easier, the better) to run jobs from outside the cluster without any SSH (just with the 'hadoop jar' command, or even simply by executing the application).
  • The job language itself should allow short, logical semantics. Ideally the code should be simple enough to be generated automatically.
  • The solution should perform well on large HBase tables (initially up to 100,000,000 entries).
  • The solution should be 'live' (actively developed) but reasonably good in terms of general stability.

I think the reasoning here could be even more useful than the solution itself, and this question should give a couple of ideas to many people. Any piece of advice?

Solution

HPaste http://www.gravity.com/labs/hpaste/ may be what you are looking for.
