Can't drop columns or slice dataframe using dask?

Question

I am trying to use dask instead of pandas because I have a 2.6 GB CSV file. I load it and want to drop a column, but it seems that neither the drop method, df.drop('column'), nor slicing, df[:, :-1], is implemented yet. Is this the case, or am I just missing something?

Recommended answer

We implemented the drop method in this PR (github.com/ContinuumIO/dask/pull/546). It is available as of dask 0.7.0.

In [1]: import pandas as pd

In [2]: df = pd.DataFrame({'x': [1, 2, 3], 'y': [3, 2, 1]})

In [3]: import dask.dataframe as dd

In [4]: ddf = dd.from_pandas(df, npartitions=2)

In [5]: ddf.drop('y', axis=1).compute()
Out[5]: 
   x
0  1
1  2
2  3

Previously, one could also select columns by name with slicing, though this is less attractive if you have many columns.

In [6]: ddf[['x']].compute()
Out[6]: 
   x
0  1
1  2
2  3
