Does HBase impose a maximum size per row?
Problem description
High-Level Question:
Does HBase impose a maximum size per row which is common to all distributions (and thus not an artifact of implementation), either in terms of bytes-stored or in terms of number of cells?
If so:
- What is the limit?
- What is the reason the limit exists?
- Where is the limit documented?
If not:
- Is documentation (or results of a test) available demonstrating the ability of HBase to handle rows in excess of 2GB? 4GB?
- Is there a practical or "best practice" maximum under which HBase API users should keep row sizes in order to avoid severe performance degradation? If so, what kind of performance degradation can occur if that guidance is discarded?
In either case:
- Does the answer depend on the HBase version in question?
Background:
- At least one implementation of the HBase API does appear to impose a limit; MapR Tables, which uses MapR's proprietary MapR-FS as the storage layer underlying the tables, appears to impose a hard limit of 2GB per row and a configurable soft limit which defaults to 32MB. Do other popular implementations of the HBase API also impose such a restriction?
- This Quora response from HBase committer Todd Lipcon in 2011 suggests the absence of a limit in terms of number of cells. However, it also indicates that "the unit of load balancing and distribution is the region, and a row will never be split across regions". Does the requirement that a row exist within a single region impose either a hard limit on the row size, or a practical limit, past which performance degradation becomes severe?
Solution
A row must fit into a single region file, which is assigned to a region server and replicated. The region file size is configurable through "hbase.hregion.max.filesize".
This page gives the value as 10 GB (default/max): http://hortonworks.com/blog/apache-hbase-region-splitting-and-merging/
The HBase reference guide says it can be set as high as 100 GB, which suggests that 10 GB is a default rather than a hard ceiling:
"To disable automatic splitting, set hbase.hregion.max.filesize to a very large value, such as 100 GB. It is not recommended to set it to its absolute maximum value of Long.MAX_VALUE."
http://hbase.apache.org/book.html#important_configurations
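To make the sizes above concrete, here is a minimal sketch that converts the 10 GB and 100 GB figures into byte values and renders them as hbase-site.xml property fragments. It assumes the setting is expressed in bytes and that "GB" means binary gigabytes (1024^3 bytes). The helper names `region_max_filesize_property` and `render` are hypothetical and are not part of any HBase tooling.

```python
import xml.etree.ElementTree as ET

GIB = 1024 ** 3          # assumption: "GB" in the answer means 2**30 bytes
LONG_MAX = 2 ** 63 - 1   # Java's Long.MAX_VALUE, the absolute maximum mentioned above


def region_max_filesize_property(size_gib: int) -> ET.Element:
    """Build an hbase-site.xml <property> element for hbase.hregion.max.filesize.

    Hypothetical helper for illustration only.
    """
    size_bytes = size_gib * GIB
    if size_bytes >= LONG_MAX:
        # The quoted guidance advises against using Long.MAX_VALUE itself.
        raise ValueError("value must stay below Long.MAX_VALUE")
    prop = ET.Element("property")
    ET.SubElement(prop, "name").text = "hbase.hregion.max.filesize"
    ET.SubElement(prop, "value").text = str(size_bytes)
    return prop


def render(prop: ET.Element) -> str:
    """Serialize the element to a string without an XML declaration."""
    return ET.tostring(prop, encoding="unicode")


default_xml = render(region_max_filesize_property(10))   # the 10 GB figure
large_xml = render(region_max_filesize_property(100))    # the 100 GB figure
print(default_xml)
print(large_xml)
```

The resulting fragment would go inside the `<configuration>` element of hbase-site.xml. Whether a very large value is appropriate depends on the cluster, so treat the numbers as illustrations of the quoted guidance, not as recommendations.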