PicklingError when copying a very large Cassandra table using cqlsh
Question

When I try to copy a table into Cassandra using the command:
copy images from 'images.csv'
I get the error:

PicklingError: Can't pickle <class 'cqlshlib.copyutil.ImmutableDict'>: attribute lookup cqlshlib.copyutil.ImmutableDict failed
I have successfully imported all of my other tables, but this one is not working. The only difference with this one is that it contains large binary blobs for images.
Here is a sample row from the csv file:
b267ba01-5420-4be5-b962-7e563dc245b0,,0x89504e...[large binary blob]...426082,0,7e700538-cce3-495f-bfd2-6a4fa968bdf6,pentium_e6600,01fa819e-3425-47ca-82aa-a3eec319a998,0,7e700538-cce3-495f-bfd2-6a4fa968bdf6,,,png,0
And here is the file that causes the error: https://www.dropbox.com/s/5mrl6nuwelpf3lz/images.csv?dl=0
And here is my schema:
CREATE TABLE dealtech.images (
    id uuid PRIMARY KEY,
    attributes map<text, text>,
    data blob,
    height int,
    item_id uuid,
    name text,
    product_id uuid,
    scale double,
    seller_id uuid,
    text_bottom int,
    text_top int,
    type text,
    width int
);
The tables were exported using Cassandra 2.x and I am currently importing them using Cassandra 3.0.9.
Answer

I ran into this same issue with Apache Cassandra 3.9, although my datasets were fairly small (46 rows in one table, 262 rows in another):
PicklingError: Can't pickle <class 'cqlshlib.copyutil.link'>: attribute lookup cqlshlib.copyutil.link failed
PicklingError: Can't pickle <class 'cqlshlib.copyutil.attribute'>: attribute lookup cqlshlib.copyutil.attribute failed

where link and attribute are types I defined.
The COPY commands were part of a .cql script that was run inside a Docker container as part of its setup process.
I read in a few places that people were seeing this PicklingError on Windows (it seemed to be related to NTFS), but the Docker container in this case was using Alpine Linux.
The fix was to add these options to the end of my COPY commands:

WITH MINBATCHSIZE = 1 AND MAXBATCHSIZE = 1 AND PAGESIZE = 10;
http://docs.datastax.com/en/cql/3.3/cql/cql_reference/cqlshCopy.html
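For reference, combining the question's COPY command with these options gives something like the following (a sketch; the table and file names are taken from the question, and the option values are the ones that worked for me):

```sql
-- Smaller batches and pages keep each worker's payload small,
-- which avoided the PicklingError in my low-memory container.
COPY images FROM 'images.csv' WITH MINBATCHSIZE = 1 AND MAXBATCHSIZE = 1 AND PAGESIZE = 10;
```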
I was not seeing the PicklingError when running these .cql scripts containing COPY commands locally, so it seems to be an issue that only rears its head in a low-memory situation.
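For background, the error itself is ordinary Python pickle behavior rather than anything Cassandra-specific: cqlsh's COPY uses multiprocessing (see the second related question below), which pickles objects to hand them to worker processes, and pickle locates a class by looking its name up in its module. A class that cannot be found that way fails with exactly this kind of "attribute lookup ... failed" message. A minimal reproduction (the `make_class`/`Dynamic` names are illustrative, not from cqlshlib):

```python
import pickle

def make_class():
    # A class defined inside a function is not reachable via
    # module-level attribute lookup, which is how pickle finds classes.
    class Dynamic:
        pass
    return Dynamic

try:
    pickle.dumps(make_class()())
except Exception as e:
    # Depending on the Python version this raises PicklingError or
    # AttributeError, both complaining that the class can't be located.
    print("pickling failed:", e)
```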
相关问题:
- Pickling Error running COPY command: CQLShell on Windows
- Cassandra multiprocessing can't pickle _thread.lock objects