How can I remove extra whitespace from strings when parsing a csv file in Pandas?
Problem description
I have the following file named 'data.csv':
1997,Ford,E350
1997, Ford , E350
1997,Ford,E350,"Super, luxurious truck"
1997,Ford,E350,"Super ""luxurious"" truck"
1997,Ford,E350," Super luxurious truck "
"1997",Ford,E350
1997,Ford,E350
2000,Mercury,Cougar
I would like to parse it into a pandas DataFrame that looks as follows:
   Year     Make   Model              Description
0  1997     Ford    E350                     None
1  1997     Ford    E350                     None
2  1997     Ford    E350   Super, luxurious truck
3  1997     Ford    E350  Super "luxurious" truck
4  1997     Ford    E350    Super luxurious truck
5  1997     Ford    E350                     None
6  1997     Ford    E350                     None
7  2000  Mercury  Cougar                     None
The best I could do is:
pd.read_table("data.csv", sep=r',', names=["Year", "Make", "Model", "Description"])
Which gets me:
   Year     Make   Model              Description
0  1997     Ford    E350                     None
1  1997    Ford     E350                     None
2  1997     Ford    E350   Super, luxurious truck
3  1997     Ford    E350  Super "luxurious" truck
4  1997     Ford    E350   Super luxurious truck 
5  1997     Ford    E350                     None
6  1997     Ford    E350                     None
7  2000  Mercury  Cougar                     None
How can I get the DataFrame without the extra whitespace?
Recommended answer
You could use converters:
import pandas as pd

def strip(text):
    # Strip surrounding whitespace; values without .strip() pass through unchanged.
    try:
        return text.strip()
    except AttributeError:
        return text

def make_int(text):
    # Drop stray quotes and spaces before converting the year to an integer.
    return int(text.strip('" '))

table = pd.read_table("data.csv", sep=r',',
                      names=["Year", "Make", "Model", "Description"],
                      converters={'Description': strip,
                                  'Model': strip,
                                  'Make': strip,
                                  'Year': make_int})
print(table)
which yields
   Year     Make   Model              Description
0  1997     Ford    E350                     None
1  1997     Ford    E350                     None
2  1997     Ford    E350   Super, luxurious truck
3  1997     Ford    E350  Super "luxurious" truck
4  1997     Ford    E350    Super luxurious truck
5  1997     Ford    E350                     None
6  1997     Ford    E350                     None
7  2000  Mercury  Cougar                     None
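If listing every column in converters feels repetitive, another option is to load the file first and strip the strings afterwards. The following is only a sketch under a few assumptions: the helper name strip_strings is made up for illustration, the rows of data.csv are inlined through io.StringIO so the snippet runs on its own, and missing descriptions are left as whatever missing-value marker your pandas version produces.

```python
import io

import pandas as pd

# Same rows as data.csv, inlined so the sketch is self-contained.
RAW = '''1997,Ford,E350
1997, Ford , E350
1997,Ford,E350,"Super, luxurious truck"
1997,Ford,E350,"Super ""luxurious"" truck"
1997,Ford,E350," Super luxurious truck "
"1997",Ford,E350
1997,Ford,E350
2000,Mercury,Cougar
'''


def strip_strings(df):
    """Return a copy of df with whitespace stripped from every string cell.

    Hypothetical helper, not part of pandas. Non-string cells
    (integers, missing values) are returned unchanged.
    """
    out = df.copy()
    for col in out.columns:
        out[col] = out[col].map(
            lambda v: v.strip() if isinstance(v, str) else v
        )
    return out


raw_table = pd.read_csv(
    io.StringIO(RAW),
    header=None,
    names=["Year", "Make", "Model", "Description"],
)
table = strip_strings(raw_table)
print(table)
```

With a real file, replace io.StringIO(RAW) with "data.csv". read_csv also accepts skipinitialspace=True, but that only skips spaces directly after a delimiter, so the trailing space in " Ford " would remain.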