Pass a regex to the delimiter field in Python's csv module or NumPy's genfromtxt / loadtxt?


Question

I have tabulated data with some strange delimiting (i.e. groups of values separated by commas, which are in turn separated from other values by tabs):

A,345,567   56  67  test

Is there a clean and clever way of handling multiple delimiters in any of the following: the csv module, numpy.genfromtxt, or numpy.loadtxt?

I have found workarounds such as this, but I'm hoping there is a better solution out there. Ideally, I'd like to use genfromtxt with a regex as the delimiter.

Recommended answer

I'm afraid the answer is no for all three of the tools you asked about. However, you can simply call replace('\t', ',') on the file contents (or do the reverse) so that only one delimiter remains. For example:

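import csv
from io import StringIO  # Python 2: from StringIO import StringIO

# Read the whole file and turn every tab into a comma,
# so that csv only has to deal with a single delimiter.
with open('./file') as fh:
    buf = StringIO(fh.read().replace('\t', ','))

reader = csv.reader(buf)

for row in reader:
    print(row)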
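
If a regex is really what you want, one option is to skip the delimiter argument and split each line yourself with the standard re module. The following is only a minimal sketch: the sample string and the helper name split_rows are made up for illustration, and it assumes the data contains no quoted fields (unlike csv, re.split knows nothing about quoting).

```python
import re

# Hypothetical sample modelled on the row in the question:
# commas inside a group, tabs between groups.
sample = "A,345,567\t56\t67\ttest\nB,1,2\t3\t4\tdemo\n"

# Character class matching either delimiter; adjust it to your data.
DELIMS = re.compile(r"[,\t]")

def split_rows(text):
    """Yield each non-empty line as a list of fields split on DELIMS."""
    for line in text.splitlines():
        if line:
            yield DELIMS.split(line)

rows = list(split_rows(sample))
for row in rows:
    print(row)
```

Each row comes back as a plain list of strings, so any type conversion still has to be done afterwards.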
