Customizing the separator in pandas read_csv

Problem description

I am reading many different data files into various pandas DataFrames. The columns in these files are separated by spaces, but the number of spaces differs from file to file (some use a single space, others two, and so on). As a result, every time I import a file I have to open it by hand, count the spaces used, and pass exactly that many spaces in sep:

import pandas as pd
df = pd.read_csv('myfile.dat', sep = '    ')

Is there any way to tell pandas to treat "any number of spaces" as the separator? Also, is there any way to tell pandas to accept either a tab (\t) or spaces as the separator?

Recommended answer

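Yes, you can use a simple regular expression such as sep=r'\s+', which matches one or more whitespace characters. Because \s covers tabs as well as spaces, the same separator handles both cases in the question. Writing the pattern as a raw string (r'\s+') keeps Python from treating \s as a string escape sequence.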
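As a minimal sketch of the suggestion above: the sample data below is made up for illustration and is read from an in-memory string through io.StringIO rather than from a real file such as myfile.dat. The second call, using the character class [ \t]+ with engine="python", is an assumed variant for the case where only spaces and tabs (and no other whitespace) should count as separators.

```python
import io

import pandas as pd

# Hypothetical sample data: fields are separated by one space,
# several spaces, or a tab, mixed within the same file.
raw = "a b   c\n1  2\t3\n4 5  6\n"

# r"\s+" treats any run of whitespace as a single separator.
df = pd.read_csv(io.StringIO(raw), sep=r"\s+")
print(df)

# Assumed variant: restrict the separator to runs of spaces and tabs only.
# Regex separators other than \s+ are handled by the Python parsing engine,
# so it is requested explicitly here.
df_tabs_spaces = pd.read_csv(io.StringIO(raw), sep=r"[ \t]+", engine="python")
print(df_tabs_spaces)
```

For a file on disk, the same sep argument can be passed together with the file path in place of the io.StringIO object.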
