Python recognizing \r as a line delimiter

Problem description

I'm using import fileinput in a Python script running on an Ubuntu box.

I'm running the script on the command line with something along the lines of python myscript.py firstinputfile.txt secondinputfile.txt and inside myscript.py I am using for line in fileinput.input() to iterate over the lines. The problem I'm running into is that firstinputfile.txt and secondinputfile.txt both use Macintosh (\r) line endings, and fileinput.input() does not seem to be recognizing \r as a line delimiter.

Is there any way to force fileinput to recognize \r as a line delimiter?

I've considered preprocessing firstinputfile.txt and secondinputfile.txt to use \n line endings, but am hesitant for two reasons: i) I don't really want to emit additional files to manage and ii) I still want the input to fileinput to come from file arguments (not stdin after piping commands) so I can use fileinput.filename() and fileinput.filelineno().

Any suggestions?

Solution

It turns out fileinput.input() supports an optional openhook parameter:

You can control how files are opened by providing an opening hook via the openhook parameter to fileinput.input() or FileInput(). The hook must be a function that takes two arguments, filename and mode, and returns an accordingly opened file-like object. Two useful hooks are already provided by this module.

Furthermore, the universal newline support document suggests that a file can be opened with the rU mode to support Windows/Unix/Macintosh newlines:

Opening a file with the mode 'U' or 'rU' will open a file for reading in universal newline mode. All three line ending conventions will be translated to a "\n" in the strings returned by the various file methods such as read() and readline().
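As a rough illustration of the translation described in the quoted passage, the sketch below writes a small file containing all three line-ending styles and reads it back in universal newline mode. It uses io.open with newline=None rather than the 'U' flag; that substitution is an assumption made here so the example also runs on current Python 3 releases, where 'U' is no longer accepted. The helper name read_universal and the sample data are made up for the example.

```python
import io
import os
import tempfile


def read_universal(path):
    # newline=None turns on universal newline translation for reading:
    # CR, CRLF and LF line endings all come back as a single "\n".
    with io.open(path, "r", newline=None) as handle:
        return handle.read()


# Hypothetical sample file: one line per line-ending convention.
fd, sample_path = tempfile.mkstemp(suffix=".txt")
with os.fdopen(fd, "wb") as raw:
    raw.write(b"mac\rwindows\r\nunix\n")

text = read_universal(sample_path)
os.remove(sample_path)

print(repr(text))
```

Every line in the returned string ends in "\n", whatever the file used on disk, which is the property the hook below relies on.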

So, you can write a little function to pass as the openhook argument that will open the file in a manner which supports universal newlines:

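def univ_file_read(name, mode):
    # WARNING: ignores the mode argument passed to this function.
    # NOTE: 'rU' works on Python 2 and on Python 3 up to 3.10;
    # the 'U' flag was removed in Python 3.11.
    return open(name, 'rU')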

Then, instead of:

for line in fileinput.input():

Use:

for line in fileinput.input(openhook=univ_file_read):

This seems to have done the trick for me, and \r is being recognized as a line delimiter now.
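For readers on Python 3, here is a minimal sketch of the same idea, assuming newline=None is an acceptable stand-in for 'rU'. The helper numbered_lines and the temporary file names are hypothetical and exist only to make the example self-contained. On Python 3, a text-mode open() already applies universal newline translation by default, so the hook mostly serves to make the intent explicit.

```python
import fileinput
import os
import shutil
import tempfile


def univ_file_read(name, mode):
    # The mode argument is ignored, as in the hook above. newline=None asks
    # for universal newline translation, so CR, CRLF and LF all end a line.
    return open(name, "r", newline=None)


def numbered_lines(paths):
    # Collect (filename, line number within that file, text) for each line.
    rows = []
    with fileinput.input(files=paths, openhook=univ_file_read) as stream:
        for line in stream:
            rows.append(
                (fileinput.filename(), fileinput.filelineno(), line.rstrip("\n"))
            )
    return rows


# Hypothetical input files that use old Macintosh (CR only) line endings.
tmp_dir = tempfile.mkdtemp()
paths = []
for name, payload in (("first.txt", b"a\rb\r"), ("second.txt", b"c\rd")):
    path = os.path.join(tmp_dir, name)
    with open(path, "wb") as raw:
        raw.write(payload)
    paths.append(path)

rows = numbered_lines(paths)
shutil.rmtree(tmp_dir)

for filename, lineno, text in rows:
    print(os.path.basename(filename), lineno, text)
```

Because the files are still passed by name, fileinput.filename() and fileinput.filelineno() keep working, which was the reason for not preprocessing the input or piping it through stdin.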
