Assign Koalas Column from Numpy Result


Question

I am trying to replicate pandas functionality in Databricks Koalas. In pandas:

import numpy as np
import pandas as pd

df = pd.DataFrame({'a': [450, 1, 26],
                   'b': [1, 450, 70],
                  })
thresh = list(range(26))  # threshold values 0 through 25
# flag column 'c' with 1 where 'a' or 'b' is one of the threshold values, else 0
df["c"] = np.where((df.a.isin(thresh) | df.b.isin(thresh)), 1, 0)
df
# returns
Out[32]: 
     a    b  c
0  450    1  1
1    1  450  1
2   26   70  0

In Koalas:

import databricks.koalas as ks
import numpy as np

df = ks.DataFrame({'a': [450, 1, 26],
                   'b': [1, 450, 70],
                  })

thresh = list(range(26))  # threshold values 0 through 25
# same flagging logic as in pandas, passed to assign()
df = df.assign(c=np.where((df.a.isin(thresh) | df.b.isin(thresh)), 1, 0))
# returns
PandasNotImplementedError: The method `pd.Series.__iter__()` is not implemented. If you want to collect your data as an NumPy array, use 'to_numpy()' instead.

How do I properly use to_numpy(), as the error message suggests, or wrap the NumPy result in a ks.Series() so that assign() will accept it?

Wrapping the result, as in df = df.assign(c=ks.Series(np.where((df.a.isin(thresh) | df.b.isin(thresh)), 1, 0))), gives the same error as above.

Is there a way to replicate this pandas functionality in Koalas?

Recommended Answer

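To perform this operation on a ks.DataFrame you don't need np.where at all. As the error message indicates, np.where ends up iterating over the Koalas Series, which Koalas does not implement. Since the condition is already a boolean Series, you can cast it to integers with astype instead: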

df = df.assign(c=(df.a.isin(thresh) | df.b.isin(thresh)).astype(int))
df
     a    b  c
0  450    1  1
1    1  450  1
2   26   70  0
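
The answer works because casting a boolean mask to int gives the same 1/0 flags that np.where(mask, 1, 0) would. The sketch below checks that equivalence in plain pandas rather than Koalas, so it can be run without a Spark cluster; it assumes the Koalas Series mirrors pandas here, and the names mask, via_where and via_astype are illustrative only, not part of the original post.

```python
import numpy as np
import pandas as pd

# Same data as in the question, built with pandas (an assumption made so
# that the example runs locally without Spark or Koalas installed).
df = pd.DataFrame({'a': [450, 1, 26], 'b': [1, 450, 70]})
thresh = list(range(26))  # threshold values 0 through 25

# Boolean mask: True where either column holds one of the threshold values
mask = df.a.isin(thresh) | df.b.isin(thresh)

via_where = np.where(mask, 1, 0)   # NumPy array of 1/0 flags
via_astype = mask.astype(int)      # pandas Series of 1/0 flags

# Both approaches yield identical flags for every row
assert (via_where == via_astype.to_numpy()).all()

df = df.assign(c=via_astype)
print(df)
```

Note that the astype form only covers the 1/0 case shown in the question; flags other than 1 and 0 would need a different approach.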
