Pandas - Strip white space

Question

I am using python csvkit to compare 2 files like this:

df1 = pd.read_csv('input1.csv', sep=r',\s+', delimiter=',', encoding="utf-8")
df2 = pd.read_csv('input2.csv', sep=r',\s,', delimiter=',', encoding="utf-8")
df3 = pd.merge(df1,df2, on='employee_id', how='right')
df3.to_csv('output.csv', encoding='utf-8', index=False)

Currently I run the file through a script beforehand that strips spaces from the employee_id column.

An example of employee_ids:

37 78973 3
23787
2 22 3
123

Is there a way to get csvkit to do it and save me a step?

Recommended answer

You can strip() an entire Series in Pandas using .str.strip():

df1['employee_id'] = df1['employee_id'].str.strip()
df2['employee_id'] = df2['employee_id'].str.strip()

This removes leading and trailing whitespace from the employee_id column in both df1 and df2.
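
As a minimal sketch of what .str.strip() does and does not touch, the snippet below uses a small made-up DataFrame (the values are illustrative, not from the original files). It assumes the column holds strings; if pandas has parsed the column as integers, the .str accessor is not available.

```python
import pandas as pd

# Hypothetical sample data, kept as strings so the .str accessor works.
df = pd.DataFrame({"employee_id": ["  23787", "123  ", " 2 22 3 "]})

# strip() trims whitespace at both ends of each value only.
stripped = df["employee_id"].str.strip()

print(stripped.tolist())  # ['23787', '123', '2 22 3']
```

Note that the spaces inside "2 22 3" survive, so strip() alone is not enough for IDs with embedded spaces.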

Alternatively, you can modify your read_csv lines to also use skipinitialspace=True, which skips spaces that follow the delimiter:

# sep and delimiter are aliases; recent pandas versions raise a ValueError when both are given
df1 = pd.read_csv('input1.csv', sep=',', encoding="utf-8", skipinitialspace=True)
df2 = pd.read_csv('input2.csv', sep=',', encoding="utf-8", skipinitialspace=True)
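
To illustrate the effect of skipinitialspace, here is a small sketch that reads an in-memory CSV instead of a file. The CSV text, the name column, and dtype=str are assumptions made for the example; they are not part of the original question.

```python
import io
import pandas as pd

# Hypothetical CSV content with a space after each comma.
csv_text = "employee_id, name\n23787, Alice\n2 22 3, Bob\n"

# dtype=str keeps the IDs as strings; skipinitialspace drops the
# spaces that directly follow each delimiter.
df = pd.read_csv(io.StringIO(csv_text), sep=",", skipinitialspace=True, dtype=str)

print(df.columns.tolist())         # ['employee_id', 'name']
print(df["name"].tolist())         # ['Alice', 'Bob']
print(df["employee_id"].tolist())  # ['23787', '2 22 3']
```

The option only affects spaces immediately after a delimiter, so spaces embedded in a value such as "2 22 3" are left in place.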

It looks like you are attempting to remove spaces in a string containing numbers. You can do this by:

df1['employee_id'] = df1['employee_id'].str.replace(" ","")
df2['employee_id'] = df2['employee_id'].str.replace(" ","")
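
The following sketch applies the same replacement to the sample IDs from the question and then performs the right merge. The dept and name columns and the contents of the second frame are hypothetical, added only so the merge has something to join; regex=False is passed so the space is treated as a literal string regardless of pandas version.

```python
import pandas as pd

# IDs taken from the question; the 'dept' column is made up for illustration.
df1 = pd.DataFrame({
    "employee_id": ["37 78973 3", "23787", "2 22 3", "123"],
    "dept": ["a", "b", "c", "d"],
})
# Hypothetical second file whose IDs contain no spaces.
df2 = pd.DataFrame({"employee_id": ["2223", "999"], "name": ["Bob", "Eve"]})

# Remove every space, including those inside the value.
df1["employee_id"] = df1["employee_id"].str.replace(" ", "", regex=False)
print(df1["employee_id"].tolist())  # ['37789733', '23787', '2223', '123']

# Right merge keeps every row of df2; unmatched rows get NaN for df1's columns.
merged = pd.merge(df1, df2, on="employee_id", how="right")
print(merged)
```

Without the replacement step, "2 22 3" would not match "2223" and the merged row for Bob would have no dept value.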
