Create new column from specific rows in pandas dataframe


Question

I have a csv file where each property row is followed by a variable number of rows that describe the rooms in that property. I want to create a column that, for each property, sums the floor area of its rooms to give a gross floor area. The unstructured nature of the data is making this difficult to achieve in pandas. Here is an example of the table I have at the moment:

id  ba  store_desc      floor_area
0   1   Toy Shop        NaN
1   2   Retail Zone A   29.42
2   2   Retail Zone B   31.29
3   1   Grocery Store   NaN
4   2   Retail Zone A   68.00
5   2   Outside Garden  83.50
6   2   Office          7.30

This is the table I want to create:

id  ba  store_desc      floor_area   gross_floor_area
0   1   Toy Shop        NaN          60.71
3   1   Grocery Store   NaN          158.8

Does anybody have any pointers on how to achieve this result? I'm totally lost.

Sam

Recommended answer

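IIUC (if I understand correctly), the rows with a missing floor_area mark the start of each property, so a cumulative sum over that mask gives a group key for each property and its rooms: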

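# Property rows are the ones with no floor_area
is_property = df['floor_area'].isnull()
df1 = df[is_property].copy()  # copy so the assignment below does not raise SettingWithCopyWarning

# cumsum() over the mask increments at every property row, so each property shares a key with its rooms
df1['gross_floor_area'] = df.groupby(is_property.cumsum())['floor_area'].sum().values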

df1
Out[463]: 
   id  ba    store_desc  floor_area  gross_floor_area
0   0   1       ToyShop         NaN             60.71
3   3   1  GroceryStore         NaN            158.80
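
For anyone who wants to try the approach end to end, here is a minimal self-contained sketch. It rebuilds the sample table from the question by hand (the DataFrame construction and the names `is_property`, `group_key` and `result` are my own, not from the original post) and assumes the first row is always a property row, so the number of groups matches the number of property rows. The `round(2)` is only there to tidy floating-point noise in the sums.

```python
import numpy as np
import pandas as pd

# Sample data typed in from the question; in practice this would come from pd.read_csv
df = pd.DataFrame({
    "id": range(7),
    "ba": [1, 2, 2, 1, 2, 2, 2],
    "store_desc": ["Toy Shop", "Retail Zone A", "Retail Zone B",
                   "Grocery Store", "Retail Zone A", "Outside Garden", "Office"],
    "floor_area": [np.nan, 29.42, 31.29, np.nan, 68.00, 83.50, 7.30],
})

# True on property rows (no floor_area), False on room rows
is_property = df["floor_area"].isnull()

# Running count of property rows: every room inherits the key of the property above it
group_key = is_property.cumsum()

# Keep only the property rows, then attach the per-group sums in order
result = df[is_property].copy()
result["gross_floor_area"] = (
    df.groupby(group_key)["floor_area"].sum().round(2).values
)

print(result)
```

With the sample data this should give gross floor areas of 60.71 and 158.80, matching the table requested in the question. Because `.values` assigns by position rather than by index, the result is only correct when the groups and the property rows come out in the same order, which holds here since both follow the original row order.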
