Resampling pandas dataframe is deleting column
Question
Val ts year doy interpolat region_id
2000-02-18 NaN 950832000 2000 49 NaN 19987
2000-03-05 NaN 952214400 2000 65 NaN 19987
2000-03-21 NaN 953596800 2000 81 NaN 19987
2000-04-06 0.402539365 954979200 2000 97 NaN 19987
2000-04-22 0.54021746 956361600 2000 113 NaN 19987
The above dataframe has a datetime index. I resample it like so:
df = df.resample('D')
However, this resampling results in this dataframe:
ts year doy interpolat region_id
2000-01-01 1199180160 2008 1 1 19990
2000-01-02 NaN NaN NaN NaN NaN
2000-01-03 NaN NaN NaN NaN NaN
2000-01-04 NaN NaN NaN NaN NaN
2000-01-05 NaN NaN NaN NaN NaN
Why did the 'Val' column disappear? All the other columns seem messed up too. See "Linearly interpolate missing rows in pandas dataframe" for an explanation of where the dataframe comes from.
--EDIT Based on @unutbu's questions:
df.reset_index().to_dict('list')
{'index': [Timestamp('2000-02-18 00:00:00'), Timestamp('2000-03-05 00:00:00'), Timestamp('2000-03-21 00:00:00'), ... '0.670709965', '0.631584375', '0.562112815', '0.50740686', '0.4447712', '0.47880806', nan, nan]}
-- The csv file for the above data frame in its entirety is here:
https://www.dropbox.com/s/dp76hk6yfs6c1og/test.csv?dl=0
Recommended answer
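The Val column probably does not have a numeric dtype for some reason, and all non-numeric (e.g. object dtype) columns are dropped by resample. The quoted strings such as '0.670709965' in the to_dict output above suggest that the values are stored as strings rather than floats.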
To check, just look at df.info().
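As a minimal sketch of that check: the frame below is a hypothetical three-row miniature of the one in the question, built by hand with Val stored as strings. It is an assumption for illustration, not the asker's actual data.

```python
import numpy as np
import pandas as pd

# Hypothetical miniature of the question's frame: Val holds strings, not floats.
idx = pd.to_datetime(["2000-03-21", "2000-04-06", "2000-04-22"])
df = pd.DataFrame(
    {
        "Val": [np.nan, "0.402539365", "0.54021746"],
        "ts": [953596800, 954979200, 956361600],
    },
    index=idx,
)

# df.info() prints one line per column with its dtype.
df.info()

# List the columns that pandas does not treat as numeric.
non_numeric = df.select_dtypes(exclude="number").columns.tolist()
print(non_numeric)
```

Any column that shows up in non_numeric is a candidate for conversion before resampling.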
To convert it to a numeric column, you can use astype(float) or convert_objects (pd.to_numeric starting from v0.17).
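A minimal sketch of the conversion followed by the resample, on a hypothetical two-row frame (the data and the choice of errors="coerce" are assumptions for illustration). It also assumes a pandas version in which resample returns an object that needs an explicit aggregation such as .mean(), rather than the older behaviour shown in the question.

```python
import numpy as np
import pandas as pd

# Hypothetical frame: Val is stored as strings, doy is already numeric.
idx = pd.to_datetime(["2000-04-06", "2000-04-08"])
df = pd.DataFrame({"Val": ["0.4", "0.6"], "doy": [97, 99]}, index=idx)

# Convert Val to a numeric dtype; unparseable entries would become NaN.
df["Val"] = pd.to_numeric(df["Val"], errors="coerce")

# Upsample to daily frequency; days with no data are filled with NaN.
daily = df.resample("D").mean()
print(daily)
```

With Val converted, the column is still present after resampling. astype(float) would also work here when every entry parses as a number.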