python: using .iterrows() to create columns
Question
I am trying to use a loop function to create a matrix of whether a product was seen in a particular week.
Each row in the df (representing a product) has a Close_date_wk (the week the product closed) and a week_diff (the number of weeks the product was listed).
import pandas
mydata = [{'subid': 'A', 'Close_date_wk': 25, 'week_diff': 3},
          {'subid': 'B', 'Close_date_wk': 26, 'week_diff': 2},
          {'subid': 'C', 'Close_date_wk': 27, 'week_diff': 2}]
df = pandas.DataFrame(mydata)
My goal is to see how many alternative products were listed for each product in each date_range.
I have set up the following loop:
for index, row in df.iterrows():
    i = 0
    max_range = row['Close_date_wk']
    min_range = int(row['Close_date_wk'] - row['week_diff'])
    for i in range(min_range, max_range):
        col_head = 'job_week_' + str(i)
        row[col_head] = 1
Can you please help explain why the "row[col_head] = 1" line is neither adding a column nor adding a value to that column for that row?
For example, if:
row A has date range 1,2,3
row B has date range 2,3
row C has date range 3,4,5
then ideally I would like to end up with:
row A has 0 alternative products in week 1
          1 alternative product in week 2
          2 alternative products in week 3
row B has 1 alternative product in week 2
          2 alternative products in week 3
etc.
Recommended answer
You can't mutate the df using row here to add a new column; you'd either refer to the original df or use .loc, .iloc, or .ix (.ix has since been removed from pandas). For example:
In [29]:
import numpy as np
import pandas as pd
df = pd.DataFrame(columns=list('abc'), data=np.random.randn(5, 3))
df
Out[29]:
a b c
0 -1.525011 0.778190 -1.010391
1 0.619824 0.790439 -0.692568
2 1.272323 1.620728 0.192169
3 0.193523 0.070921 1.067544
4 0.057110 -1.007442 1.706704
In [30]:
for index, row in df.iterrows():
    df.loc[index, 'd'] = np.random.randint(0, 10)
df
Out[30]:
a b c d
0 -1.525011 0.778190 -1.010391 9
1 0.619824 0.790439 -0.692568 9
2 1.272323 1.620728 0.192169 1
3 0.193523 0.070921 1.067544 0
4 0.057110 -1.007442 1.706704 9
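Applied to the data in the question, the same df.loc approach might look like the sketch below. The names week and week_cols, and the final step that fills the gaps with 0, are additions for illustration and are not part of the original question or answer.

```python
import pandas as pd

mydata = [
    {'subid': 'A', 'Close_date_wk': 25, 'week_diff': 3},
    {'subid': 'B', 'Close_date_wk': 26, 'week_diff': 2},
    {'subid': 'C', 'Close_date_wk': 27, 'week_diff': 2},
]
df = pd.DataFrame(mydata)

for index, row in df.iterrows():
    max_range = int(row['Close_date_wk'])
    min_range = max_range - int(row['week_diff'])
    for week in range(min_range, max_range):
        # Write through df.loc so the DataFrame itself changes;
        # a column that does not exist yet is created on first assignment.
        df.loc[index, 'job_week_' + str(week)] = 1

# Cells that were never assigned are NaN; turn them into 0 (illustrative choice).
week_cols = [c for c in df.columns if c.startswith('job_week_')]
df[week_cols] = df[week_cols].fillna(0).astype(int)
print(df)
```

Each product ends up with a 1 in every job_week_ column it was listed in and a 0 elsewhere.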
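You can modify existing values through row (this relies on iterrows() returning a view rather than a copy, which pandas does not guarantee):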
In [31]:
# reset the df by slicing
df = df[list('abc')]
for index, row in df.iterrows():
    row['b'] = np.random.randint(0, 10)
df
Out[31]:
a b c
0 -1.525011 8 -1.010391
1 0.619824 2 -0.692568
2 1.272323 8 0.192169
3 0.193523 2 1.067544
4 0.057110 3 1.706704
But adding a new column using row won't work:
In [35]:
df = df[list('abc')]
for index, row in df.iterrows():
    row['d'] = np.random.randint(0, 10)
df
Out[35]:
a b c
0 -1.525011 8 -1.010391
1 0.619824 2 -0.692568
2 1.272323 8 0.192169
3 0.193523 2 1.067544
4 0.057110 3 1.706704
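The reason is that each row yielded by iterrows() is its own Series object. Assigning to a label that does not exist enlarges that Series, and nothing connects the new entry back to the DataFrame. A minimal sketch with a made-up two-row frame shows where the new label ends up:

```python
import pandas as pd

df = pd.DataFrame({'a': [1.0, 2.0], 'b': [3.0, 4.0]})

for index, row in df.iterrows():
    # row is a standalone Series; a new label is added to it only.
    row['d'] = 0
    print(type(row).__name__, list(row.index))

# The DataFrame still has only its original columns.
print(list(df.columns))
```

The label 'd' shows up in each row Series but never in df.columns, which matches the unchanged output above.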
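For the stated goal of counting alternative products per week, the row loop may not be needed at all. The sketch below builds the week indicator matrix with NumPy broadcasting and then counts the other products listed in the same week. It assumes that "alternative products" means the number of other products listed in that week, and the names start, end, weeks, indicator, counts and alternatives are illustrative.

```python
import numpy as np
import pandas as pd

mydata = [
    {'subid': 'A', 'Close_date_wk': 25, 'week_diff': 3},
    {'subid': 'B', 'Close_date_wk': 26, 'week_diff': 2},
    {'subid': 'C', 'Close_date_wk': 27, 'week_diff': 2},
]
df = pd.DataFrame(mydata)

# First listed week (inclusive) and closing week (exclusive), as in the question's range().
start = (df['Close_date_wk'] - df['week_diff']).to_numpy()
end = df['Close_date_wk'].to_numpy()
weeks = np.arange(start.min(), end.max())

# listed[i, j] is True when product i was listed in weeks[j].
listed = (weeks >= start[:, None]) & (weeks < end[:, None])
indicator = pd.DataFrame(
    listed.astype(int),
    index=df['subid'],
    columns=['job_week_' + str(w) for w in weeks],
)

# Products listed per week; alternatives are the others in that week.
counts = indicator.sum(axis=0)
# Keep a value only where the product itself was listed (NaN elsewhere).
alternatives = indicator.where(indicator == 1).mul(counts - 1, axis='columns')
print(alternatives)
```

Leaving NaN in weeks where a product was not listed keeps "not listed" distinct from "listed with 0 alternatives".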