Datastax Cassandra PIG Running only one MAP


Problem description

I am using Datastax Cassandra 3.1.4 with two nodes. I am running Pig with CqlStorage() against a table with 12 million rows, but only one map runs for a simple Pig command.

I tried changing split_size in my Pig relation, but it did not work.

Here is my sample query.

x = load 'cql://Mykeyspace/MyCF?split_size=1000' using CqlStorage();
y = limit x 500;
dump y;

I could not find an input.split.size property in my mapred-site.xml, so I am assuming the default split size of 64*1024 applies.

I also tried set pig.splitCombination false;

Now it takes 513 maps for any number of records. I tried the same thing from Hive.

I connected to Cassandra from Hive and ran a simple select-all query with where col1 > value. The table has only 10 records, but the query still runs 513 maps.

Please help me with this.

Thanks

Solution

Try this setting:

set pig.splitCombination false;

By default, Pig combines what it considers small splits into a single map.
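As a minimal sketch, the setting can be placed at the top of the script from the question, before the load statement. The commented-out pig.maxCombinedSplitSize line is an assumption added for illustration and is not part of the original answer. It is a standard Pig property that caps the combined split size in bytes, and the value shown is arbitrary.

```pig
-- Turn off split combination so each input split gets its own map task.
set pig.splitCombination false;

-- Assumption (not from the original answer): instead of disabling combination,
-- leave it on and cap how much data one map may receive, in bytes.
-- The value below is illustrative only.
-- set pig.maxCombinedSplitSize 67108864;

x = load 'cql://Mykeyspace/MyCF?split_size=1000' using CqlStorage();
y = limit x 500;
dump y;
```

In the question, turning combination off entirely produced 513 maps. If that is too many for a small table, capping the combined size may be a middle ground worth trying.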
