dask apply: AttributeError: 'DataFrame' object has no attribute 'name'
Problem description
I have a dataframe of params and apply a function to each row. This function is essentially a couple of SQL queries and some simple calculations on the result.
I am trying to leverage Dask's multiprocessing while keeping the structure and roughly the same interface. The example below works and gives a significant speedup:
import pandas as pd

def _metric(row):
    # Descriptive fields taken straight from the params row
    record = {'areaName': row['name'],
              'areaType': row.area_type,
              'borough': row.Borough,
              'fullDate': row['start'],
              'yearMonth': row['start'],
              }
    # Build the SQL query from the row's parameters
    Q = Qsi.format(unittypes=At,
                   start_date=row['start'],
                   end_date=row['end'],
                   freq='Q',
                   area_ids=row['descendent_ids'])
    sales = _get_DF(Q)
    record['salesInventory'] = len(sales)
    record['medianAskingPrice'] = sales.price.median()
    # Results are collected in the module-level list R
    R.append(record)

R = []
x = ddf.map_partitions(lambda x: x.apply(_metric, axis=1), meta={'result': None})
x.compute()
result2 = pd.DataFrame(R)
However, when I try to use the .apply method instead (see below), it raises 'DataFrame' object has no attribute 'name':
...
R = list()
y = ddf.apply(_metric, axis=1, meta={'result': None})
Yet ddf.head() shows that there is a name column in the dataframe.
Recommended answer
If the output of your _metric function is a Series, maybe you should use meta=('name of your series', 'dtype of the output').
This worked for me.