Dask item assignment. Cannot use loc for item assignment
Question
I have a folder of parquet files that does not fit in memory, so I am using dask to perform the data cleansing operations. I have a function in which I want to perform item assignment, but I can't find any solution online that works for this particular function. Below is the function that works in pandas. How do I get the same result with a dask dataframe? I thought delayed might help, but none of the solutions I've tried to write have worked.
import numpy as np

def item_assignment(df):
    # Keep only bits 1 and 2 of OtherCol
    new_col = np.bitwise_and(df['OtherCol'], 0b110)
    df['NewCol'] = 0
    df.loc[new_col == 0b010, 'NewCol'] = 1
    df.loc[new_col == 0b100, 'NewCol'] = -1
    return df
With a dask dataframe, the loc assignments raise:
TypeError: '_LocIndexer' object does not support item assignment
Recommended answer
You can replace your loc assignments with dask.dataframe.Series.mask:
df['NewCol'] = 0
df['NewCol'] = df['NewCol'].mask(new_col == 0b010, 1)
df['NewCol'] = df['NewCol'].mask(new_col == 0b100, -1)