HBase write: which one performs better, batch or put(List<Put>)?


Question

I am starting to learn HBase in order to write data streams. I use HTableInterface and am having performance problems. Inserting only 500 rows takes a very long time: almost 500,000 ms for each batch List that I insert.

Are there any examples or suggestions for batch writes into an HTable with HTableInterface? I am using HBase 0.94.

Thanks

Solution

They're essentially the same. batch(List<? extends Row> actions, Object[] results) accepts not only puts but also gets, deletes, increments, and so on, whereas put(List<Put> puts) just does a batch of puts (it also validates them client-side).

You can also batch writes by disabling auto-flush with table.setAutoFlush(false), issuing standard puts to the table, and flushing the buffer afterwards with table.flushCommits().
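To illustrate why disabling auto-flush helps, here is a toy model in plain Java with no HBase dependency. ToyTable and roundTripsFor are hypothetical names made up for this sketch; only the method names setAutoFlush, put and flushCommits mirror the calls mentioned above. The model counts one simulated round trip per flush and ignores details of the real client, such as flushing when the write buffer fills up or grouping writes by region server.

```java
import java.util.ArrayList;
import java.util.List;

public class WriteBufferSketch {

    /** Hypothetical stand-in for a table client; NOT the HBase API. */
    static class ToyTable {
        private final List<String> buffer = new ArrayList<>();
        private boolean autoFlush = true; // assumed default: flush on every put
        int roundTrips = 0;               // simulated network round trips
        int rowsWritten = 0;

        void setAutoFlush(boolean autoFlush) {
            this.autoFlush = autoFlush;
        }

        // Buffers the row; with auto-flush on, sends it immediately.
        void put(String row) {
            buffer.add(row);
            if (autoFlush) {
                flushCommits();
            }
        }

        // Sends everything buffered so far in one simulated round trip.
        void flushCommits() {
            if (buffer.isEmpty()) {
                return;
            }
            roundTrips++;
            rowsWritten += buffer.size();
            buffer.clear();
        }
    }

    /** Writes the given number of rows one by one and returns the round trips used. */
    public static int roundTripsFor(int rows, boolean autoFlush) {
        ToyTable table = new ToyTable();
        table.setAutoFlush(autoFlush);
        for (int i = 0; i < rows; i++) {
            table.put("row-" + i);
        }
        table.flushCommits(); // push out whatever is still buffered
        return table.roundTrips;
    }

    public static void main(String[] args) {
        System.out.println("auto-flush on:  " + roundTripsFor(500, true) + " round trips");
        System.out.println("auto-flush off: " + roundTripsFor(500, false) + " round trips");
    }
}
```

In this model, 500 single puts cost 500 round trips with auto-flush on and one round trip with it off, which is the effect the buffered approach is meant to achieve.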

I don't know the size of your rows, but unless they're huge, it seems you have some sort of problem with your configuration (network latency, maybe?). Even 500 puts performed row by row should complete a lot faster.
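As a quick sanity check on that diagnosis, the figures from the question can be turned into a per-row cost. This sketch assumes the reported numbers are taken at face value (about 500,000 ms for a batch of 500 rows); msPerRow is a hypothetical helper name.

```java
public class PerRowLatency {

    /** Average time spent per row, in milliseconds. */
    public static double msPerRow(long totalMs, int rows) {
        return (double) totalMs / rows;
    }

    public static void main(String[] args) {
        // Figures reported in the question: ~500,000 ms for 500 rows.
        double perRow = msPerRow(500_000L, 500);
        System.out.println(perRow + " ms per row");
    }
}
```

That works out to roughly 1,000 ms, a full second, per row, which points to a problem outside the choice between batch and put(List<Put>).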
