Open() and codecs.open() in Python 2.7 behave strangely different

Views: 805 | Posted: 2017/11/4 21:11:04 | Tags: python, python-2.7, file-io, codec, python-unicode


Problem description

I have a text file whose first line contains Unicode characters; all other lines are ASCII.
I am trying to read the first line into one variable and all remaining lines into another. However, when I use the following code:

    # -*- coding: utf-8 -*-
    import codecs

    filename = '1.txt'

    # codecs.open() decodes the file contents as UTF-8
    f = codecs.open(filename, 'r', encoding='utf-8')
    print f
    names_f = f.readline().split(' ')
    data_f = f.readlines()
    print len(names_f)
    print len(data_f)
    f.close()

    print 'And now for something completely different:'

    # the built-in open() returns undecoded byte strings
    g = open(filename, 'r')
    names_g = g.readline().split(' ')
    print g
    data_g = g.readlines()
    print len(names_g)
    print len(data_g)
    g.close()

I get the following output:

    <open file '1.txt', mode 'rb' at 0x01235230>
    28
    7
    And now for something completely different:
    <open file '1.txt', mode 'r' at 0x017875A0>
    28
    77

If I don't use readlines(), the whole file is read, not only the first 7 lines, with both codecs.open() and open().

Why does this happen?
And why does codecs.open() read the file in binary mode even though the 'r' mode is passed?

Update: this is the original file: http://www1.datafilehost.com/d/0792d687

Solution

Because you called .readline() first, the codecs.open() file object has already filled its line buffer; the subsequent call to .readlines() returns only the buffered lines.

If you call .readlines() again, the rest of the lines are returned:

    >>> f = codecs.open(filename, 'r', encoding='utf-8')
    >>> line = f.readline()
    >>> len(f.readlines())
    7
    >>> len(f.readlines())
    71

The work-around is to not mix .readline() and .readlines():

    f = codecs.open(filename, 'r', encoding='utf-8')
    data_f = f.readlines()
    names_f = data_f.pop(0).split(' ')  # take the first line.
    f.close()

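Since the original 1.txt may no longer be available, here is a minimal sketch of the same workaround that builds its own sample file. The helper name `read_header_and_rows` and the sample contents (one non-ASCII header line plus 77 ASCII rows) are assumptions made for illustration, not part of the original question. It is written to run on both Python 2.7 and Python 3.

```python
from __future__ import print_function

import codecs
import os
import tempfile


def read_header_and_rows(path, encoding="utf-8"):
    """Hypothetical helper: read everything with one readlines() call,
    then split off the first line, so readline() is never mixed in."""
    f = codecs.open(path, "r", encoding=encoding)
    try:
        lines = f.readlines()
    finally:
        f.close()
    header = lines.pop(0).split(u" ") if lines else []
    return header, lines


# Build a stand-in for 1.txt (assumed contents, not the asker's file).
fd, path = tempfile.mkstemp(suffix=".txt")
os.close(fd)
out = codecs.open(path, "w", encoding="utf-8")
try:
    out.write(u"caf\u00e9 na\u00efve r\u00e9sum\u00e9\n")
    for i in range(77):
        out.write(u"row %d\n" % i)
finally:
    out.close()

names_f, data_f = read_header_and_rows(path)
os.remove(path)

print(len(names_f))  # 3 header fields
print(len(data_f))   # 77 data lines
```

Reading the whole file in a single call sidesteps the line buffer entirely, at the cost of holding every line in memory at once.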
This behaviour is really a bug; the Python devs are aware of it, see issue 8260.

The other option is to use io.open() instead of codecs.open(); the io library is what Python 3 uses to implement the built-in open() function and is a lot more robust and versatile than the codecs module.
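
As a rough illustration of the io.open() option, the sketch below mixes .readline() and .readlines() on a generated sample file. The file contents are an assumption standing in for the original 1.txt (one non-ASCII header line followed by 77 ASCII rows). The code is intended to run on both Python 2.7 and Python 3.

```python
from __future__ import print_function

import io
import os
import tempfile

# Build a stand-in for 1.txt (assumed contents, not the asker's file).
header = u"caf\u00e9 na\u00efve r\u00e9sum\u00e9\n"
rows = [u"row %d\n" % i for i in range(77)]

fd, path = tempfile.mkstemp(suffix=".txt")
os.close(fd)
with io.open(path, "w", encoding="utf-8") as out:
    out.write(header)
    out.writelines(rows)

# io.open() returns decoded unicode text; readline() followed by
# readlines() continues from where the first call stopped.
with io.open(path, "r", encoding="utf-8") as f:
    names = f.readline().split(u" ")
    data = f.readlines()

os.remove(path)

print(len(names))  # 3 header fields
print(len(data))   # 77 remaining lines
```

Note that io.open() in text mode works with unicode objects on Python 2, so anything written through it must be unicode rather than a plain byte string.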
