Pandas: trim leading & trailing white space in a DataFrame
Question
Develop a function that trims leading and trailing white space.
Here is a simple sample, but the real file contains far more complex rows and columns.
import numpy as np
import pandas as pd

df=pd.DataFrame([["A b ",2,3],[np.nan,2,3],\
[" random",43,4],[" any txt is possible "," 2 1",22],\
["",23,99],[" help ",23,np.nan]],columns=['A','B','C'])
The result should eliminate all leading and trailing white space, but retain the spaces within the text.
df=pd.DataFrame([["A b",2,3],[np.nan,2,3],\
["random",43,4],["any txt is possible","2 1",22],\
["",23,99],["help",23,np.nan]],columns=['A','B','C'])
Note that the function needs to cover all possible situations. Thank you.
Recommended answer
I think you need to check whether each value is a string, because the columns hold mixed values (numbers alongside strings), and call strip on each string:
df = df.applymap(lambda x: x.strip() if isinstance(x, str) else x)
print (df)
                     A    B     C
0                  A b    2   3.0
1                  NaN    2   3.0
2               random   43   4.0
3  any txt is possible  2 1  22.0
4                        23  99.0
5                 help   23   NaN
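Since the question asks for a function, the element-wise approach above can be wrapped in one. This is only a minimal sketch: the name `trim_strings` is hypothetical, and the `hasattr` check is there because `DataFrame.applymap` was deprecated in pandas 2.1 in favor of `DataFrame.map`.

```python
import numpy as np
import pandas as pd


def trim_strings(df):
    """Return a new DataFrame with whitespace stripped from every str cell.

    Hypothetical helper name; non-string cells (numbers, NaN) pass through unchanged.
    """
    def strip(x):
        return x.strip() if isinstance(x, str) else x

    # DataFrame.map exists from pandas 2.1 on; older versions only have applymap.
    if hasattr(pd.DataFrame, "map"):
        return df.map(strip)
    return df.applymap(strip)


df = pd.DataFrame([["A b ", 2, 3], [np.nan, 2, 3],
                   [" random", 43, 4], [" any txt is possible ", " 2 1", 22],
                   ["", 23, 99], [" help ", 23, np.nan]], columns=['A', 'B', 'C'])
trimmed = trim_strings(df)
print(trimmed)
```

The function returns a new DataFrame and leaves the input untouched, so the original data can still be compared against the trimmed copy.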
If every column has a single dtype, you can strip only the object columns with .str.strip(). In that case you will not get NaNs for numeric values, as happens here in the mixed column B of your sample:
cols = df.select_dtypes(['object']).columns
df[cols] = df[cols].apply(lambda x: x.str.strip())
print (df)
                     A    B     C
0                  A b  NaN   3.0
1                  NaN  NaN   3.0
2               random  NaN   4.0
3  any txt is possible  2 1  22.0
4                       NaN  99.0
5                 help  NaN   NaN
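The NaNs in column B above appear because `.str.strip()` returns NaN for every non-string cell. One possible way to keep the column-wise approach for mixed columns is to put the original values back wherever the stripped result is missing. The sketch below is an assumption-laden illustration, not part of the original answer: the helper name `strip_text_columns` is hypothetical, and it assumes each object column contains at least one string (otherwise the `.str` accessor raises).

```python
import numpy as np
import pandas as pd


def strip_text_columns(df):
    """Strip strings column by column, leaving non-string cells untouched.

    Hypothetical helper; assumes every object column holds at least one string.
    """
    out = df.copy()
    for col in out.columns:
        s = out[col]
        # Only text-like columns are processed; numeric columns are skipped.
        if s.dtype == object or pd.api.types.is_string_dtype(s):
            stripped = s.str.strip()
            # .str.strip() yields NaN for non-strings; restore the originals there.
            out[col] = stripped.where(stripped.notna(), s)
    return out


df = pd.DataFrame([["A b ", 2, 3], [np.nan, 2, 3],
                   [" random", 43, 4], [" any txt is possible ", " 2 1", 22],
                   ["", 23, 99], [" help ", 23, np.nan]], columns=['A', 'B', 'C'])
result = strip_text_columns(df)
print(result)
```

With this variant the numeric values in column B survive, while genuine missing values in column A stay NaN.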