Create new column from specific rows in pandas dataframe
Problem description
I have a csv file where each property is represented by one row, followed by a variable number of rows that describe the rooms in that property. I want to create a column that, for each property, sums the floor area of its rooms to give a gross floor area. The unstructured nature of the data is making this difficult to achieve in pandas. Here is an example of the table I have at the moment:
id ba store_desc floor_area
0 1 Toy Shop NaN
1 2 Retail Zone A 29.42
2 2 Retail Zone B 31.29
3 1 Grocery Store NaN
4 2 Retail Zone A 68.00
5 2 Outside Garden 83.50
6 2 Office 7.30
This is the table I want to create:
id ba store_desc floor_area gross_floor_area
0 1 Toy Shop NaN 60.71
3 1 Grocery Store NaN 158.8
Does anybody have any pointers on how to achieve this result? I'm totally lost.
Sam
Recommended answer
IIUC
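# Property rows are the ones with a missing floor_area; .copy() avoids SettingWithCopyWarning
df1 = df[df['floor_area'].isnull()].copy()
# Each property row starts a new group, so cumsum() labels a property together with its rooms
df1['gross_floor_area'] = df.groupby(df['floor_area'].isnull().cumsum())['floor_area'].sum().values
df1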
Out[463]:
id ba store_desc floor_area gross_floor_area
0 0 1 ToyShop NaN 60.71
3 3 1 GroceryStore NaN 158.80
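For anyone who wants to try this end to end, here is a minimal self-contained sketch of the same cumsum-and-groupby idea, built on the sample data from the question. The helper name `add_gross_floor_area` is hypothetical, and the sketch assumes that a missing `floor_area` always marks a property row. Unlike the answer above, it looks up each total by group label with `map` instead of assigning `.values` by position, which should also cope with stray room rows that appear before the first property.

```python
import numpy as np
import pandas as pd

# Sample data copied from the question; property rows have a missing floor_area.
df = pd.DataFrame({
    "id": [0, 1, 2, 3, 4, 5, 6],
    "ba": [1, 2, 2, 1, 2, 2, 2],
    "store_desc": ["Toy Shop", "Retail Zone A", "Retail Zone B",
                   "Grocery Store", "Retail Zone A", "Outside Garden", "Office"],
    "floor_area": [np.nan, 29.42, 31.29, np.nan, 68.00, 83.50, 7.30],
})


def add_gross_floor_area(frame):
    """Hypothetical helper: return the property rows with summed room areas."""
    # Assumption: a missing floor_area marks a property row.
    is_property = frame["floor_area"].isna()
    # cumsum() increments at every property row, so each property and the
    # rooms that follow it share one group label.
    group_id = is_property.cumsum()
    # sum() skips NaN, so the property row itself contributes nothing.
    totals = frame.groupby(group_id)["floor_area"].sum()
    result = frame[is_property].copy()
    # Look up each property's total by its group label rather than by position.
    result["gross_floor_area"] = group_id[is_property].map(totals).round(2)
    return result


result = add_gross_floor_area(df)
print(result)
```

If the `ba` column reliably distinguishes properties (1) from rooms (2), `frame["ba"].eq(1)` could replace the `isna()` test, but the question does not confirm what `ba` means, so that is only a guess.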