Why do we flush all MemStores in HBase at the same time?

Problem description

I'm going through some HBase architecture notes here: https://mapr.com/blog/in-depth-look-hbase-architecture/ and saw that it says:

There is one MemStore per Column Family; when one is full, they all flush. It also saves the last written sequence number so the system knows what was persisted so far.

My question is two-fold.

  • Why do we flush all MemStores at once? Couldn't we just flush the MemStore that's full? Let's say we have two MemStores: 1 and 2. If 1 is flushed, then for future Gets we can still check MemStore 2 before checking disk (HFiles) for 2's Column Family, right?
  • What does "last written sequence number" mean? I'm trying to visualize how flushing MemStores would happen but maybe a visual example would help. Let's say I have MemStore 1 with row keys a, b, and d and I flush them. What's the "last written sequence number"?

Solution

Let's start with how HBase handles write operations. When you perform a write to HBase, it does the following (simplified view):

  • append KV write to WAL
  • fsync WAL
  • apply write to MemStore
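The three steps above can be sketched as a toy model. This is only an illustration of the ordering, not HBase's actual implementation: the `ToyRegion` class, the tab-separated log format, and the `demo` helper are all hypothetical names made up for this example.

```python
import os
import tempfile


class ToyRegion:
    """Toy model of the simplified write path; not HBase's real classes."""

    def __init__(self, wal_path):
        self.wal = open(wal_path, "a", encoding="utf-8")
        self.memstore = {}      # (row, family) -> (seqnum, value)
        self.next_seqnum = 1    # region-wide, monotonically increasing

    def put(self, row, family, value):
        seqnum = self.next_seqnum
        self.next_seqnum += 1
        # 1. Append the KV, tagged with its sequence number, to the WAL.
        self.wal.write(f"{seqnum}\t{family}\t{row}\t{value}\n")
        # 2. fsync so the entry reaches stable storage before we continue.
        self.wal.flush()
        os.fsync(self.wal.fileno())
        # 3. Only now apply the write to the in-memory store.
        self.memstore[(row, family)] = (seqnum, value)
        return seqnum

    def close(self):
        self.wal.close()


def demo():
    """Write two KVs and return what ended up in the WAL and the MemStore."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "wal.log")
        region = ToyRegion(path)
        first = region.put("row1", "a", "x")
        second = region.put("row2", "b", "y")
        region.close()
        with open(path, encoding="utf-8") as f:
            wal_lines = f.read().splitlines()
    return first, second, wal_lines, region.memstore
```

The point of the ordering is that the MemStore is never ahead of the log: anything visible in memory already has a durable WAL entry behind it.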

Each write operation is tagged with a 'sequence number'. This is a kind of MVCC transaction ID. Quoting the HBase docs:

A region-specific unique monotonically increasing sequence ID given to each Cell. It always exists for cells in the memstore but is not retained forever.

The sequence number is written to the WAL along with the new KV as part of the write operation. After a successful write to the WAL, HBase applies the change to the MemStore and responds to the client that the write succeeded. From this point on, the new KV is persisted and will not be lost if the RegionServer dies.
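To see why a KV that exists only in the WAL and the MemStore survives a crash, here is a minimal sketch of rebuilding the in-memory state by replaying the log in sequence-number order. `WalEntry` and `replay` are hypothetical names for this illustration; real HBase recovery is considerably more involved.

```python
from typing import NamedTuple


class WalEntry(NamedTuple):
    seqnum: int
    family: str
    row: str
    value: str


def replay(wal):
    """Rebuild MemStore contents by applying WAL entries in seqnum order."""
    memstore = {}
    for entry in sorted(wal, key=lambda e: e.seqnum):
        # A later seqnum for the same cell overwrites the earlier one.
        memstore[(entry.row, entry.family)] = (entry.seqnum, entry.value)
    return memstore


# The MemStore was lost in a crash, but the WAL is still on disk.
wal = [
    WalEntry(1, "a", "row1", "old"),
    WalEntry(2, "a", "row2", "v"),
    WalEntry(3, "a", "row1", "new"),
]
recovered = replay(wal)
```

Here `recovered` holds the latest value for each cell, which is the state the MemStore had before the crash.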

Because each write increases the size of the WAL, HBase has to truncate it to reduce disk usage. Before the WAL drops an entry, it must be sure that the change the entry describes has been durably persisted to disk, so that no update is lost if the server crashes. For that purpose, the WAL tracks the aforementioned "last written sequence number" (LWSN) of each region hosted by the RegionServer.

A region's LWSN represents the most recent write that was flushed to disk. All write operations with a greater seqnum reside only in the MemStore, not yet on disk. The WAL uses the region's LWSN to find entries whose seqnum is no greater than that LWSN. Such entries can be removed from the WAL because they were already flushed to disk and will not be lost in a server crash.
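A rough sketch of that truncation rule follows. It assumes a WAL shared by several regions, with each entry carrying its region name and seqnum, and it treats an entry as removable when its seqnum is at or below the region's LWSN (the exact boundary is an assumption here). `truncate_wal` is a made-up helper, not an HBase API.

```python
def truncate_wal(wal, lwsn_by_region):
    """Keep only the entries not yet covered by their region's LWSN."""
    return [
        (region, seqnum, kv)
        for region, seqnum, kv in wal
        # A region that has never flushed has an implicit LWSN of 0.
        if seqnum > lwsn_by_region.get(region, 0)
    ]


wal = [
    ("r1", 1, "k1"), ("r2", 1, "k2"), ("r1", 2, "k3"),
    ("r1", 3, "k4"), ("r2", 2, "k5"),
]
# Region r1 has flushed everything up to seqnum 2; r2 has flushed nothing.
lwsn = {"r1": 2, "r2": 0}
remaining = truncate_wal(wal, lwsn)
```

Only the first two `r1` entries are dropped; everything from `r2` stays because none of it is on disk yet.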

Let's look at an example of how HBase tracks the LWSN. Suppose you have two column families, 'a' and 'b'. You perform 200 write operations: the first 100 are written to 'a' and the other 100 to 'b'. The seqnums of the operations on column family 'a' are in the range [1..100], and those for 'b' are in [101..200]. Suppose the writes to 'b' are larger and cause a flush of the MemStore of 'b', but not of 'a'. During this flush, HBase should update the LWSN of the region. It would be incorrect to update it to 200, because the writes with seqnums [1..100] are not persisted yet (and must not be truncated from the WAL).
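The example can be worked through numerically. The sketch below numbers the 200 writes 1 through 200 and computes the largest LWSN that is still safe, meaning every write at or below it is on disk. `safe_lwsn` is a hypothetical helper written for this illustration, and it models a single region-wide LWSN as described in this answer.

```python
def safe_lwsn(unflushed_by_family, highest_seqnum):
    """Largest seqnum N such that every write with seqnum <= N is on disk."""
    oldest_pending = [min(s) for s in unflushed_by_family.values() if s]
    if not oldest_pending:
        # Nothing is left in memory, so everything written so far is on disk.
        return highest_seqnum
    # The oldest write still in memory caps how far the LWSN may advance.
    return min(oldest_pending) - 1


# Flush only 'b': its MemStore empties, but 'a' still holds 1..100 in memory.
only_b_flushed = {"a": set(range(1, 101)), "b": set()}
lwsn_partial = safe_lwsn(only_b_flushed, highest_seqnum=200)

# Flush both families: no write remains in memory only.
both_flushed = {"a": set(), "b": set()}
lwsn_full = safe_lwsn(both_flushed, highest_seqnum=200)
```

With only 'b' flushed, `lwsn_partial` stays at 0, so not a single WAL entry can be dropped even though half of the data is on disk. With both families flushed, `lwsn_full` reaches 200 and the whole log up to that point becomes removable.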

That's why HBase flushes the MemStores of all column families at once: if it flushed only the full MemStore, it could not advance the region's LWSN, and WAL truncation would be delayed (which can lead to a long recovery if the server crashes).
