HBase - What's the difference between WAL and MemStore?

Question

I am trying to understand the HBase architecture. I see two different terms that appear to be used for the same purpose.

The Write Ahead Log and the MemStore are both used to store new data that hasn't yet been persisted to permanent storage.

What's the difference between WAL and MemStore?

Update:

WAL - used to recover not-yet-persisted data if a server crashes. MemStore - stores updates in memory as sorted KeyValues.

It seems like a lot of data is duplicated before it is written to disk.

Recommended answer

The WAL is for recovery, not for data duplication. (For more, see my answer here.)

Please read on for more detail.

  • An HBase Store hosts a MemStore and zero or more StoreFiles (HFiles). A Store corresponds to one column family of a table within a given region.
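
To make the Store layout concrete, here is a minimal toy sketch in plain Java. It is not HBase code: `ToyStore` and all of its methods are hypothetical names invented for illustration, and sorted in-memory maps stand in for real HFiles. It only shows the shape of the idea: one sorted in-memory buffer plus a growing list of immutable sorted files.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.NavigableMap;
import java.util.TreeMap;

// Toy model only: ToyStore is a hypothetical class, not part of the HBase API.
public class ToyStore {
    // MemStore stand-in: an in-memory buffer that keeps entries sorted by key.
    private NavigableMap<String, String> memStore = new TreeMap<>();
    // StoreFile stand-ins: immutable sorted snapshots, oldest first.
    private final List<NavigableMap<String, String>> storeFiles = new ArrayList<>();

    public void put(String key, String value) {
        memStore.put(key, value);
    }

    // Flush: freeze the current MemStore as a new StoreFile and start an empty one.
    public void flush() {
        if (memStore.isEmpty()) {
            return;
        }
        storeFiles.add(Collections.unmodifiableNavigableMap(memStore));
        memStore = new TreeMap<>();
    }

    // Read: check the MemStore first, then the StoreFiles from newest to oldest.
    public String get(String key) {
        if (memStore.containsKey(key)) {
            return memStore.get(key);
        }
        for (int i = storeFiles.size() - 1; i >= 0; i--) {
            String value = storeFiles.get(i).get(key);
            if (value != null) {
                return value;
            }
        }
        return null;
    }

    public int storeFileCount() {
        return storeFiles.size();
    }

    public int memStoreSize() {
        return memStore.size();
    }

    public static void main(String[] args) {
        ToyStore store = new ToyStore();
        store.put("row2", "b");
        store.put("row1", "a");
        store.flush();            // one StoreFile, empty MemStore
        store.put("row1", "a2");  // the newer value lives only in the MemStore
        System.out.println(store.get("row1") + " " + store.get("row2")
                + " files=" + store.storeFileCount());
    }
}
```

A new Store starts with zero StoreFiles, and each flush adds one, which is why the description says "zero or more".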

The Write Ahead Log (WAL) records all changes to data in HBase to file-based storage. If a RegionServer crashes or becomes unavailable before the MemStore is flushed, the WAL ensures that the changes to the data can be replayed.
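
The recovery role of the WAL can be sketched with a small toy model in plain Java. This is not how HBase is implemented: `ToyRegionServer`, `crash()` and `replayWal()` are hypothetical names, and an in-memory list stands in for the log file on HDFS. The point it illustrates is the ordering: each edit is logged first, then applied to the sorted in-memory map, so the map can be rebuilt from the log after a crash.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.NavigableMap;
import java.util.TreeMap;

// Toy model only: ToyRegionServer is a hypothetical class, not part of the HBase API.
public class ToyRegionServer {
    // WAL stand-in: an append-only list that is assumed to survive a crash.
    private final List<String[]> wal = new ArrayList<>();
    // MemStore stand-in: sorted, held in memory, lost on a crash.
    private NavigableMap<String, String> memStore = new TreeMap<>();
    // Stand-in for data already flushed to durable StoreFiles.
    private final NavigableMap<String, String> flushed = new TreeMap<>();

    public void put(String key, String value) {
        wal.add(new String[] {key, value}); // 1. log the edit first
        memStore.put(key, value);           // 2. then apply it in memory
    }

    // Flush: persist the MemStore; the logged edits are no longer needed for recovery.
    public void flush() {
        flushed.putAll(memStore);
        memStore = new TreeMap<>();
        wal.clear();
    }

    // Crash: everything held only in memory disappears.
    public void crash() {
        memStore = new TreeMap<>();
    }

    // Recovery: replay the logged edits in order to rebuild the MemStore.
    public void replayWal() {
        for (String[] edit : wal) {
            memStore.put(edit[0], edit[1]);
        }
    }

    public String get(String key) {
        String value = memStore.get(key);
        return value != null ? value : flushed.get(key);
    }

    public static void main(String[] args) {
        ToyRegionServer server = new ToyRegionServer();
        server.put("row1", "a");
        server.flush();              // row1 is now durable
        server.put("row2", "b");     // row2 exists only in the WAL and the MemStore
        server.crash();
        System.out.println("after crash:  row2=" + server.get("row2"));
        server.replayWal();
        System.out.println("after replay: row2=" + server.get("row2"));
    }
}
```

In this model the two structures hold the same edits for different reasons: the log is written sequentially and read only during recovery, while the sorted map serves reads and is what eventually gets flushed.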

With a single WAL per RegionServer, the RegionServer must write to the WAL serially, because HDFS files must be sequential. This causes the WAL to be a performance bottleneck.

The WAL can be disabled to relieve this bottleneck. This is done by setting the HBase client field

Mutation.writeToWAL(false)

In current client versions this field is set through Mutation.setDurability(Durability.SKIP_WAL).

General note: it is common practice to disable the WAL while bulk loading data, to gain speed. The side effect is that, with the WAL disabled, there is nothing to replay, so data held only in memory cannot be recovered if the server crashes.
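
The trade-off can be illustrated with another toy sketch in plain Java. Again, this is not the HBase client: `ToyDurability` and the `writeToWal` parameter are hypothetical, chosen only to mirror the idea of a per-mutation switch. Edits that skip the log avoid the extra append, but they do not come back after a crash.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.NavigableMap;
import java.util.TreeMap;

// Toy model only: ToyDurability is a hypothetical class, not part of the HBase API.
public class ToyDurability {
    // WAL stand-in: assumed to survive a crash.
    private final List<String[]> wal = new ArrayList<>();
    // MemStore stand-in: lost on a crash.
    private NavigableMap<String, String> memStore = new TreeMap<>();

    // writeToWal = false models a mutation that skips the log for speed.
    public void put(String key, String value, boolean writeToWal) {
        if (writeToWal) {
            wal.add(new String[] {key, value});
        }
        memStore.put(key, value);
    }

    public int walEntries() {
        return wal.size();
    }

    // Simulate a crash followed by recovery: only logged edits can be replayed.
    public NavigableMap<String, String> crashAndRecover() {
        memStore = new TreeMap<>();
        for (String[] edit : wal) {
            memStore.put(edit[0], edit[1]);
        }
        return memStore;
    }

    public static void main(String[] args) {
        ToyDurability server = new ToyDurability();
        server.put("logged", "1", true);
        server.put("skipped", "2", false); // faster, but not recoverable
        System.out.println("WAL entries: " + server.walEntries());
        System.out.println("recovered keys: " + server.crashAndRecover().keySet());
    }
}
```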

Moreover, if you use Solr + HBase + Lily, i.e. Lily Morphline NRT indexing with HBase, be aware that Lily works off the WAL. If you disable the WAL for performance reasons, Solr NRT indexing won't work.

Please see the HBase architecture section.
