Python - how to read a file with NUL-delimited lines?

Question

I usually use the following Python code to read lines from a file:

with open('./my.csv', 'r') as f:
    for line in f:
        print(line)

But what if the lines in the file are delimited by "\0" (not "\n")? Is there a Python module that can handle this?

Thanks for any suggestions.

Recommended answer

If your file is small enough that you can read it all into memory, you can use split:

for line in f.read().split('\0'):
    print(line)
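
One detail worth noting: data produced by tools such as find -print0 usually ends with a trailing NUL, so split returns an empty string as the last item. A minimal Python 3 sketch of handling that, using an in-memory string in place of f.read(); the helper name split_nul and the sample file names are made up for illustration:

```python
def split_nul(data):
    """Split NUL-delimited text, dropping the empty item after a trailing NUL."""
    records = data.split('\0')
    if records and records[-1] == '':
        records.pop()
    return records


# Stand-in for f.read(); the file names are invented.
data = 'a.txt\0b c.txt\0'
for line in split_nul(data):
    print(line)
```

Empty records in the middle of the data are kept, since they may be meaningful; only the final empty item is dropped.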

Otherwise, you might want to try this recipe from the discussion about this feature request:

def fileLineIter(inputFile,
                 inputNewline="\n",
                 outputNewline=None,
                 readSize=8192):
    """Like the normal file iter but you can set what string indicates newline.

    The newline string can be arbitrarily long; it need not be restricted to a
    single character. You can also set the read size and control whether or not
    the newline string is left on the end of the iterated lines.  Setting
    newline to '\\0' is particularly good for use with an input file created with
    something like "os.popen('find -print0')".
    """
    if outputNewline is None:
        outputNewline = inputNewline
    partialLine = ''
    while True:
        charsJustRead = inputFile.read(readSize)
        if not charsJustRead:
            break  # end of file
        partialLine += charsJustRead
        # Everything before the last delimiter is a complete line; the
        # remainder is carried over into the next read.
        lines = partialLine.split(inputNewline)
        partialLine = lines.pop()
        for line in lines:
            yield line + outputNewline
    # Emit any trailing data that was not followed by a delimiter.
    if partialLine:
        yield partialLine
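
The recipe above builds its buffer from text strings. If the file is opened in binary mode under Python 3, read returns bytes, so the buffer and the delimiter need to be bytes as well. A sketch of that variant follows; the names iter_delimited, delimiter and read_size are hypothetical and not part of the original recipe, and io.BytesIO stands in for a real file opened with mode 'rb':

```python
import io


def iter_delimited(stream, delimiter=b'\0', read_size=8192):
    """Yield delimiter-separated records from a binary stream.

    The delimiter itself is not included in the yielded records.
    """
    buffer = b''
    while True:
        chunk = stream.read(read_size)
        if not chunk:
            break  # end of stream
        buffer += chunk
        records = buffer.split(delimiter)
        # The last piece may be incomplete; keep it for the next read.
        buffer = records.pop()
        for record in records:
            yield record
    # Emit trailing data that was not followed by a delimiter.
    if buffer:
        yield buffer


# A tiny read_size forces records to span several reads.
stream = io.BytesIO(b'first\0second\0third')
for record in iter_delimited(stream, read_size=4):
    print(record.decode('utf-8'))
```

Reading in fixed-size chunks keeps memory use bounded by the longest record rather than by the size of the file.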


I also noticed your file has a "csv" extension. Python has a built-in CSV module (import csv). It has an attribute called Dialect.lineterminator, but the reader does not currently implement it:

Dialect.lineterminator

The string used to terminate lines produced by the writer. It defaults to '\r\n'.

Note: The reader is hard-coded to recognise either '\r' or '\n' as end-of-line, and ignores lineterminator. This behavior may change in the future.
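
Since csv.reader accepts any iterable of strings, one possible workaround is to split the records on NUL yourself and hand them to the reader. This is a minimal sketch that assumes no field contains an embedded newline or NUL; the sample data is invented:

```python
import csv

# Stand-in for the contents of a NUL-delimited CSV file.
data = 'name,size\0a.txt,10\0b.txt,20\0'

# Drop empty records, such as the one after the trailing NUL.
records = [record for record in data.split('\0') if record]

rows = list(csv.reader(records))
for row in rows:
    print(row)
```

For large files, the list of records could be replaced by a generator that yields one record at a time.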
