Pandas: trim leading & trailing white space in a DataFrame

Question

I need to develop a function that trims leading and trailing white space.

Here is a simple sample; the real file contains far more complex rows and columns.

import numpy as np
import pandas as pd

df = pd.DataFrame([["A b ", 2, 3], [np.nan, 2, 3],
                   [" random", 43, 4], [" any txt is possible ", " 2 1", 22],
                   ["", 23, 99], [" help ", 23, np.nan]], columns=['A', 'B', 'C'])

The result should eliminate all leading and trailing white space but retain the spaces in between the text.

df = pd.DataFrame([["A b", 2, 3], [np.nan, 2, 3],
                   ["random", 43, 4], ["any txt is possible", "2 1", 22],
                   ["", 23, 99], ["help", 23, np.nan]], columns=['A', 'B', 'C'])

Mind that the function needs to cover all possible situations. Thank you.

Recommended answer

I think you need to check whether each value is a string, because the columns hold mixed values (numbers alongside strings), and call strip only on the strings:

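# Strip only str cells; numbers and NaN pass through unchanged.
# On pandas >= 2.1, applymap is deprecated in favor of df.map(...).
df = df.applymap(lambda x: x.strip() if isinstance(x, str) else x)
print(df)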
                     A    B     C
0                  A b    2   3.0
1                  NaN    2   3.0
2               random   43   4.0
3  any txt is possible  2 1  22.0
4                        23  99.0
5                 help   23   NaN
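
Since the question asks for a function, the one-liner above can be wrapped as a minimal sketch. The names strip_strings and strip_cell are hypothetical, not from the original answer, and the version check assumes that DataFrame.map is present on pandas 2.1 or later while older releases only offer applymap.

```python
import numpy as np
import pandas as pd


def strip_strings(frame: pd.DataFrame) -> pd.DataFrame:
    """Return a new DataFrame with whitespace stripped from every str cell.

    Hypothetical helper for illustration; non-string cells (numbers, NaN)
    are returned unchanged.
    """
    def strip_cell(value):
        return value.strip() if isinstance(value, str) else value

    # DataFrame.map was added in pandas 2.1; fall back to applymap before that.
    if hasattr(pd.DataFrame, "map"):
        return frame.map(strip_cell)
    return frame.applymap(strip_cell)


if __name__ == "__main__":
    sample = pd.DataFrame(
        [["A b ", 2, 3], [np.nan, 2, 3], [" help ", " 2 1", np.nan]],
        columns=["A", "B", "C"],
    )
    print(strip_strings(sample))
```

The helper returns a new frame rather than modifying its argument, so the original data stays available for comparison.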

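If every column holds a single dtype, you can instead strip only the object columns with str.strip. With mixed columns this does not work well, because str.strip returns NaN for non-string values, as happens to the numeric values in column B of your sample: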

# Select the object columns and strip them column-wise.
cols = df.select_dtypes(['object']).columns
df[cols] = df[cols].apply(lambda x: x.str.strip())
print(df)
                     A    B     C
0                  A b  NaN   3.0
1                  NaN  NaN   3.0
2               random  NaN   4.0
3  any txt is possible  2 1  22.0
4                       NaN  99.0
5                 help  NaN   NaN
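
If the column-wise str.strip is still preferred for mixed columns, one possible workaround is to apply it only to the cells that are actually strings, so the numeric values are not replaced by NaN. This is a sketch under assumptions, not part of the original answer: strip_text_cells is a hypothetical name, and numeric columns are skipped by excluding the "number" dtypes.

```python
import numpy as np
import pandas as pd


def strip_text_cells(frame: pd.DataFrame) -> pd.DataFrame:
    """Strip whitespace column by column, touching only str cells.

    Hypothetical helper for illustration; returns a copy of `frame`.
    """
    out = frame.copy()
    # Purely numeric columns cannot contain strings, so skip them.
    for col in out.select_dtypes(exclude=["number"]).columns:
        is_text = out[col].map(lambda v: isinstance(v, str))
        if not is_text.any():
            continue  # nothing to strip in this column
        # Use the vectorized .str.strip() on the string cells only;
        # to_numpy() avoids index alignment during assignment.
        out.loc[is_text, col] = out.loc[is_text, col].str.strip().to_numpy()
    return out


if __name__ == "__main__":
    sample = pd.DataFrame(
        [["A b ", 2, 3], [np.nan, 2, 3], [" help ", " 2 1", np.nan]],
        columns=["A", "B", "C"],
    )
    print(strip_text_cells(sample))
```

Here the integers in column B are kept, because the assignment only overwrites the positions where the mask is True.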
