HBase Thrift: how to connect to remote HBase master/cluster?


Problem description

Thanks to the Cloudera distribution, I have an HBase master/datanode + Thrift server running on a local machine, and I can write and test HBase client programs against it without problems.

However, I now need to use Thrift in production, and I'm not able to find documentation on how to get Thrift running with a production HBase cluster.

From what I understand, I will need to run the hbase-thrift program on the client node since the Thrift program is just another intermediate client to HBase.

So I'm guessing that I have to be able to somehow specify the master node hostname/IP to HBase-Thrift? How would I do this?

Also, any suggestions on how to scale this up in production? Do I only need a setup like this:

Client <-> Thrift client <-> HBase Master <-> Multiple HBase workers

Solution

Get it running

You don't have to run a Thrift server on your local machine; it can run anywhere, but the RegionServers are usually a good place*. In your code you then connect to that server.

A Python example:

from thrift.transport import TSocket
transport = TSocket.TSocket("random-regionserver", 9090)

Replace random-regionserver with the hostname of one of the servers you're running the Thrift server on.
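A fuller version of the connection, as a minimal sketch rather than a definitive client: it assumes the Apache Thrift Python library is installed, that the `hbase` package was generated from HBase's `Hbase.thrift` (for example with `thrift --gen py`), and that the Thrift server uses the buffered transport and binary protocol. The helper name `list_tables` is hypothetical.

```python
def list_tables(host, port=9090):
    """Connect to an HBase Thrift server and return its table names."""
    # Imported here so the function can be defined on machines that do not
    # have the Thrift libraries installed; they are needed only on connect.
    from thrift.transport import TSocket, TTransport
    from thrift.protocol import TBinaryProtocol
    from hbase import Hbase  # assumption: generated from Hbase.thrift

    socket = TSocket.TSocket(host, port)
    # Assumption: the server runs with the buffered (non-framed) transport.
    transport = TTransport.TBufferedTransport(socket)
    protocol = TBinaryProtocol.TBinaryProtocol(transport)
    client = Hbase.Client(protocol)

    transport.open()
    try:
        return client.getTableNames()
    finally:
        # Always release the socket, even if the call fails.
        transport.close()
```

Calling `list_tables("random-regionserver")` is a quick way to confirm that the Thrift server is reachable and can see the cluster. If the server was started with a framed transport or compact protocol, the client side has to be changed to match.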

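That server gets its configuration from the usual places. The Thrift server locates the cluster through ZooKeeper rather than through a master hostname, so the ZooKeeper quorum is what you need to specify. If you're using CDH, you'll find the configuration in /etc/hbase/conf/hbase-site.xml, and you'll need to add a property hbase.zookeeper.quorum: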

<property>
  <name>hbase.zookeeper.quorum</name>
  <value>list of your zookeeper servers</value>
</property>

When you start the Thrift server from the downloaded Apache distribution, the setup is similar, except that hbase-site.xml will probably sit in a different directory.
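To check which quorum a given hbase-site.xml actually declares before starting the Thrift server, something like the following could be used. It is only a sketch using the Python standard library; the function names are hypothetical, and it relies on hbase.zookeeper.quorum being a comma-separated list of hosts.

```python
import xml.etree.ElementTree as ET


def read_hbase_property(xml_text, name):
    """Return the value of the named property in hbase-site.xml text, or None."""
    root = ET.fromstring(xml_text)
    for prop in root.iter("property"):
        if prop.findtext("name", default="").strip() == name:
            value = prop.findtext("value")
            return value.strip() if value is not None else None
    return None


def quorum_hosts(xml_text):
    """Return the ZooKeeper quorum as a list of host names (empty if unset)."""
    value = read_hbase_property(xml_text, "hbase.zookeeper.quorum")
    if not value:
        return []
    return [host.strip() for host in value.split(",") if host.strip()]
```

On a CDH machine you would pass in the contents of /etc/hbase/conf/hbase-site.xml; an empty result means the property still has to be added.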

Scaling it up

One easy way to scale up right now is to keep a list of all the RegionServers in your Thrift client and pick one at random on connect. Alternatively, create multiple connections and use a random one each time. Some language bindings (e.g. PHP) have a TSocketPool to which you can pass all your servers. Otherwise there's some manual work you need to do.
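For bindings without a socket pool, the manual work could look roughly like this. It is a sketch under stated assumptions: `connect_to_any` is a hypothetical helper, `connect` is whatever function you use to open a connection to one Thrift server, and `errors` lists the exception types that should trigger a retry on another server (with the Thrift Python library that would typically include `TTransport.TTransportException`).

```python
import random


def connect_to_any(servers, connect, errors=(OSError,), rng=random):
    """Try the Thrift servers in random order; return the first connection.

    servers: iterable of (host, port) pairs running a Thrift server.
    connect: callable taking (host, port) and returning an open connection.
    errors:  exception types that mean "this server is unavailable".
    """
    candidates = list(servers)
    rng.shuffle(candidates)  # random order spreads clients across servers
    last_error = None
    for host, port in candidates:
        try:
            return connect(host, port)
        except errors as exc:
            last_error = exc  # remember the failure and try the next server
    raise ConnectionError(
        f"no Thrift server reachable among {len(candidates)} candidates"
    ) from last_error
```

Shuffling the whole list instead of picking a single entry gives the same random distribution on the first attempt, and also lets the client fall through to another server when the chosen one is down.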

Using this technique, all reads and writes should be more or less evenly distributed across the Thrift servers in your cluster. Each read or write operation arriving at a Thrift server is still translated into a Java API call made by the Thrift server, which then opens a network connection to the proper RegionServer(s) to perform the requested action.

That means you won't get performance as good as you would with the Java API directly. It might help if you cache region locations yourself and hit the appropriate Thrift server, but even then an additional Java API call is made, even if it ends up on the local server. HBASE-4460 would help with this scenario, but it is not included in CDH3u4 or CDH4.

* There is an issue, HBASE-4460, which actually embeds a Thrift server in a RegionServer.
