PicklingError when copying a very large Cassandra table using cqlsh

Problem Description

When I try to copy a table into Cassandra using the command:

copy images from 'images.csv'

I get the error:

'PicklingError: Can't pickle <class 'cqlshlib.copyutil.ImmutableDict'>: attribute lookup cqlshlib.copyutil.ImmutableDict failed'

I have successfully imported all of my other tables, but this one is not working. The only difference with this one is that it contains large binary blobs for images.

Here is a sample row from the CSV file:

b267ba01-5420-4be5-b962-7e563dc245b0,,0x89504e...[large binary blob]...426082,0,7e700538-cce3-495f-bfd2-6a4fa968bdf6,pentium_e6600,01fa819e-3425-47ca-82aa-a3eec319a998,0,7e700538-cce3-495f-bfd2-6a4fa968bdf6,,,png,0

And here is the file that causes the error: https://www.dropbox.com/s/5mrl6nuwelpf3lz/images.csv?dl=0

And here is my schema:

CREATE TABLE dealtech.images (
    id uuid PRIMARY KEY,
    attributes map<text, text>,
    data blob,
    height int,
    item_id uuid,
    name text,
    product_id uuid,
    scale double,
    seller_id uuid,
    text_bottom int,
    text_top int,
    type text,
    width int
);
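
COPY FROM without an explicit column list expects the CSV fields in the table's displayed column order, which the sample row above follows. Spelled out with an explicit column list (an equivalent sketch, assuming that default ordering), the import would be:

copy images (id, attributes, data, height, item_id, name, product_id, scale, seller_id, text_bottom, text_top, type, width) from 'images.csv'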

The tables were exported using Cassandra 2.x and I am currently using Cassandra 3.0.9 to import them.

Answer

I ran into this same issue with Apache Cassandra 3.9, although my datasets were fairly small (46 rows in one table, 262 rows in another table):

PicklingError: Can't pickle <class 'cqlshlib.copyutil.link'>: attribute lookup cqlshlib.copyutil.link failed

PicklingError: Can't pickle <class 'cqlshlib.copyutil.attribute'>: attribute lookup cqlshlib.copyutil.attribute failed

Where link and attribute are types I defined.

The COPY commands were part of a .cql script that was being run inside a Docker container as part of its setup process.
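
For context, a minimal sketch of such a setup (the script name and the cqlsh invocation are illustrative, not from the original answer):

-- setup.cql (hypothetical name), run during container startup with: cqlsh -f setup.cql
COPY images FROM 'images.csv';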

I read in a few places that people were seeing this PicklingError on Windows (it seemed to be related to NTFS), but the Docker container in this case was using Alpine Linux.

The fix was to add these options to the end of my COPY commands:

WITH MINBATCHSIZE = 1 AND MAXBATCHSIZE = 1 AND PAGESIZE = 10;

http://docs.datastax.com/en/cql/3.3/cql/cql_reference/cqlshCopy.html
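
Applied to the COPY command from the question, the complete statement becomes:

COPY images FROM 'images.csv' WITH MINBATCHSIZE = 1 AND MAXBATCHSIZE = 1 AND PAGESIZE = 10;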

I was not seeing the PicklingError when running these .cql scripts containing COPY commands locally, so it seems to be an issue that only rears its head in low-memory situations.
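
The pickling happens because cqlsh's COPY distributes work to Python worker subprocesses, and everything handed to them must be picklable. On that reasoning, a purely hypothetical variant (not something tried in the original answer) is to also cap the worker count with the documented NUMPROCESSES option:

COPY images FROM 'images.csv' WITH NUMPROCESSES = 1 AND MINBATCHSIZE = 1 AND MAXBATCHSIZE = 1 AND PAGESIZE = 10;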

Related questions:

  • Pickling Error running COPY command: CQLShell on Windows
  • Cassandra multiprocessing can't pickle _thread.lock objects
