Unable to convert pandas dataframe column to int variable type using .astype(int) method
Problem description
I'm iterating through the rows of a dataframe to extract values as follows, but what I get is always a float, and I'm not able to convert either result["YEAR_TORONTO"] or result["YEAR_TORONTO2"] to int:
for i in range(0, len(result)):
    if result["SOURCE_DATASET"].iloc[i] == "toronto":
        result["YEAR_TORONTO"].iloc[i] = pd.to_datetime(result["START_DATE"].iloc[i]).year
        result["YEAR_TORONTO"].iloc[i].astype(int) if not np.isnan(result["YEAR_TORONTO"].iloc[i]) else np.nan
        result["YEAR_TORONTO2"].iloc[i] = result["YEAR_TORONTO"].iloc[i]
Any idea why this could be? I tried multiple approaches, including pd.to_numeric and round(), but had no luck with any of them.
Interestingly enough, when I output
result["YEAR_TORONTO"].iloc[1].astype(int) if not np.isnan(result["YEAR_TORONTO"].iloc[i]) else np.nan
I get 2016 as an int, but once I output the entire dataframe by calling result, I still get 2016.0 as a float.
Sample data (input):
SOURCE_DATASET START_DATE
0 brampton 06-04-16
1 toronto 06-04-16
2 brampton 06-04-16
3 toronto 06-04-99
Sample data (output):
SOURCE_DATASET START_DATE YEAR_TORONTO YEAR_TORONTO2
0 brampton 06-04-16 NaN NaN
1 toronto 06-04-16 2016.0 2016.0
2 brampton 06-04-16 NaN NaN
3 toronto 06-04-99 1999.0 1999.0
I just tried np.where as well and got the same result.
Recommended answer
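Using astype() is the right approach, but it does not work if your column contains NaN: NaN is a float, so a column that holds it is stored as float64. You could first convert the values to strings and split on the decimal point: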
result["YEAR_TORONTO"].astype(str).str.split('.', expand=True)[0].tolist()
This gives the part before the decimal point as a list of strings (missing values come out as the string 'nan'); you can take it from there.
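To illustrate why the column stays float, here is a minimal sketch. It is not part of the original answer, and it assumes a pandas version that supports the nullable "Int64" extension dtype (0.24 or later). The series name `years` is made up for the example.

```python
import numpy as np
import pandas as pd

# A column that mixes whole numbers and NaN is stored as float64,
# because NaN itself is a floating-point value.
years = pd.Series([np.nan, 2016, np.nan, 1999])
print(years.dtype)  # float64

# A plain int dtype cannot represent the missing entries, so this raises.
try:
    years.astype(int)
except (ValueError, TypeError) as exc:
    print("astype(int) failed:", exc)

# The nullable "Int64" dtype (capital I) keeps whole numbers
# and marks the gaps as <NA> instead of NaN.
years_int = years.astype("Int64")
print(years_int.tolist())
```

With this dtype the dataframe prints 2016 rather than 2016.0, at the cost of using a pandas extension type instead of a NumPy integer type.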
Or
result.loc[result['TORONTO'].notnull(), 'x'] = result.loc[result['TORONTO'].notnull(), 'x'].apply(int)
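As a further option that the answer does not mention, the row-by-row loop from the question could probably be replaced by a vectorized version. The sketch below rebuilds the sample data from the question. It assumes the dates are in month-day-year format with a two-digit year (the question does not say which field is the month, but only the year is used here), and it assumes a pandas version with the nullable "Int64" dtype.

```python
import pandas as pd

# Sample input copied from the question.
result = pd.DataFrame({
    "SOURCE_DATASET": ["brampton", "toronto", "brampton", "toronto"],
    "START_DATE": ["06-04-16", "06-04-16", "06-04-16", "06-04-99"],
})

# Parse the whole column once; the format string is an assumption.
parsed = pd.to_datetime(result["START_DATE"], format="%m-%d-%y")

# Keep the year only for "toronto" rows (other rows become missing),
# then cast to the nullable integer dtype so no ".0" appears.
is_toronto = result["SOURCE_DATASET"] == "toronto"
result["YEAR_TORONTO"] = parsed.dt.year.where(is_toronto).astype("Int64")
result["YEAR_TORONTO2"] = result["YEAR_TORONTO"]

print(result)
```

Assigning whole columns also avoids the chained result[...].iloc[i] = ... assignments used in the question, which pandas may apply to a temporary copy rather than to the dataframe itself.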