Python Dask - vertical concatenation of 2 DataFrames

Views: 286 | Published: 2020/10/7 19:18:14 | Tags: python-2.7 dataframe concat dask

Problem description

I am trying to vertically concatenate two Dask DataFrames.

I have the following Dask DataFrame:

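import pandas as pd
import dask.dataframe as dd

# First row holds the column names, the remaining rows hold the data
d = [
    ['A','B','C','D','E','F'],
    [1, 4, 8, 1, 3, 5],
    [6, 6, 2, 2, 0, 0],
    [9, 4, 5, 0, 6, 35],
    [0, 1, 7, 10, 9, 4],
    [0, 7, 2, 6, 1, 2]
    ]
df = pd.DataFrame(d[1:], columns=d[0])
ddf = dd.from_pandas(df, npartitions=5)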

Here is the data as a Pandas DataFrame:

          A         B      C      D      E      F
0         1         4      8      1      3      5
1         6         6      2      2      0      0
2         9         4      5      0      6     35
3         0         1      7     10      9      4
4         0         7      2      6      1      2

Here is the Dask DataFrame:

Dask DataFrame Structure:
                   A      B      C      D      E      F
npartitions=4                                          
0              int64  int64  int64  int64  int64  int64
1                ...    ...    ...    ...    ...    ...
2                ...    ...    ...    ...    ...    ...
3                ...    ...    ...    ...    ...    ...
4                ...    ...    ...    ...    ...    ...
Dask Name: from_pandas, 4 tasks

I am trying to concatenate 2 Dask DataFrames vertically:

ddf_i = ddf + 11.5
dd.concat([ddf,ddf_i],axis=0)

But I get this error:

Traceback (most recent call last):
      ...
      File "...", line 572, in concat
        raise ValueError('All inputs have known divisions which cannot '
    ValueError: All inputs have known divisions which cannot be concatenated
    in order. Specify interleave_partitions=True to ignore order

However, if I try:

dd.concat([ddf,ddf_i],axis=0,interleave_partitions=True)

then it appears to work. Is there a problem with setting this to True (in terms of performance, i.e. speed)? Or is there another way to vertically concatenate 2 Dask DataFrames?
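
For context, the error message points at the divisions (the index boundaries of the partitions) of the two inputs. Since ddf + 11.5 changes the values but not the index, ddf_i presumably carries the same divisions as ddf, so the two index ranges overlap and cannot simply be placed one after the other. The sketch below is a simplified pure-Python model of such an ordering check, not Dask's actual code; the helper can_stack_in_order and the shifted_divisions example are hypothetical names introduced only for illustration.

```python
# Simplified model of an "in order" check on divisions.
# Illustration only: this is NOT Dask's implementation.

def can_stack_in_order(divisions_list):
    """Return True if each frame's index range ends before the next one begins."""
    return all(
        left[-1] < right[0]
        for left, right in zip(divisions_list, divisions_list[1:])
    )

# Divisions shown for ddf in the question: boundaries of its 4 partitions.
ddf_divisions = (0, 1, 2, 3, 4)
# Assumption: ddf + 11.5 keeps the index, so ddf_i has the same divisions.
ddf_i_divisions = (0, 1, 2, 3, 4)

# The two index ranges overlap, so they cannot be stacked in order.
overlapping = can_stack_in_order([ddf_divisions, ddf_i_divisions])

# Hypothetical second frame whose index starts after the first one ends.
shifted_divisions = (5, 6, 7, 8, 9)
ordered = can_stack_in_order([ddf_divisions, shifted_divisions])

print(overlapping, ordered)  # prints: False True
```

As the error message itself says, passing interleave_partitions=True tells Dask to ignore this ordering requirement.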
