Creating an empty Pandas DataFrame, then filling it?
Question
I'm starting from the pandas DataFrame docs here: http://pandas.pydata.org/pandas-docs/stable/dsintro.html
I'd like to iteratively fill the DataFrame with values in a time-series kind of calculation. So basically, I'd like to initialize the DataFrame with columns A, B, and timestamp rows, all 0 or all NaN.
I'd then add initial values and go over this data, calculating each new row from the row before, say row[A][t] = row[A][t-1]+1 or so.
I'm currently using the code below, but I feel it's kind of ugly, and there must be a way to do this with a DataFrame directly, or just a better way in general. Note: I'm using Python 2.7.
import datetime as dt
import pandas as pd
import scipy as s

if __name__ == '__main__':
    base = dt.datetime.today().date()
    dates = [ base - dt.timedelta(days=x) for x in range(0,10) ]
    dates.sort()

    valdict = {}
    symbols = ['A','B', 'C']
    for symb in symbols:
        valdict[symb] = pd.Series( s.zeros( len(dates)), dates )

    for thedate in dates:
        if thedate > dates[0]:
            for symb in valdict:
                valdict[symb][thedate] = 1+valdict[symb][thedate - dt.timedelta(days=1)]

    print valdict
Recommended answer
Here are a couple of suggestions:
Use date_range for the index:
import datetime
import pandas as pd
import numpy as np
todays_date = datetime.datetime.now().date()
index = pd.date_range(todays_date-datetime.timedelta(10), periods=10, freq='D')
columns = ['A','B', 'C']
Note: we could create an empty DataFrame (filled with NaNs) simply by writing:
df_ = pd.DataFrame(index=index, columns=columns)
df_ = df_.fillna(0) # with 0s rather than NaNs
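As a hedged aside on the note above: the DataFrame constructor also accepts a scalar as its data when both index and columns are given, which fills every cell in one step and yields a numeric dtype. A minimal sketch, assuming a fixed start date (chosen here only so the example is reproducible; the answer uses today's date) and the illustrative names zeros_df and nan_df:

```python
import numpy as np
import pandas as pd

# Fixed start date: an assumption for reproducibility, not from the original answer.
index = pd.date_range("2012-11-29", periods=10, freq="D")
columns = ["A", "B", "C"]

# A scalar passed as data is broadcast to every cell of the frame.
zeros_df = pd.DataFrame(0.0, index=index, columns=columns)
nan_df = pd.DataFrame(np.nan, index=index, columns=columns)

print(zeros_df.dtypes)
print(nan_df.head(3))
```

Both frames have the same shape and labels; only the fill value differs.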
To do these types of calculations on the data, use a NumPy array:
data = np.array([np.arange(10)]*3).T
Hence we can create the DataFrame:
In [10]: df = pd.DataFrame(data, index=index, columns=columns)

In [11]: df
Out[11]:
            A  B  C
2012-11-29  0  0  0
2012-11-30  1  1  1
2012-12-01  2  2  2
2012-12-02  3  3  3
2012-12-03  4  4  4
2012-12-04  5  5  5
2012-12-05  6  6  6
2012-12-06  7  7  7
2012-12-07  8  8  8
2012-12-08  9  9  9
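To tie this back to the recurrence in the question (each row is the previous row plus 1), one possible sketch is to run the loop over a preallocated NumPy array and build the DataFrame once at the end. The fixed start date and the names n_rows and values are assumptions for illustration, not part of the original answer:

```python
import numpy as np
import pandas as pd

# Fixed start date: an assumption so the output is reproducible.
index = pd.date_range("2012-11-29", periods=10, freq="D")
columns = ["A", "B", "C"]

n_rows = len(index)
# Preallocate; row 0 holds the initial values (all zeros here).
values = np.zeros((n_rows, len(columns)))

# Compute each row from the row before it: row[t] = row[t-1] + 1
for t in range(1, n_rows):
    values[t] = values[t - 1] + 1

# Build the DataFrame once, after all rows have been computed.
df = pd.DataFrame(values, index=index, columns=columns)
print(df.tail(3))
```

For this particular recurrence the loop could be replaced by a cumulative sum, but the loop form carries over to updates that depend on the previous row in less regular ways.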