Python: Unicode source file adds spaces (actually null bytes) between characters


Problem description

I am a newbie. However, I managed to extract some lines from a txt-file (unicode) and write them in another file.

lines = InFile.readlines()
OutFile.writelines(lines[3:])

It works, but (I believe) due to an encoding issue there is a space added between each character in the output file. Example of a result:

2 0 1 3 - 1 2 - 2 3 ; ; 3 6 0 . 3 7 
2 0 1 3 - 1 2 - 2 4 ; ; 0 . 0 0 

Lines in the source file:

2013-12-23;;360.37
2013-12-24;;0.00

If I save the txt source file as ANSI before running the script, I get the correct results. However, the source file is delivered automatically as Unicode by another program, so changing it manually every time is not practical. I have read through a lot of other encoding/decoding questions, but I am completely lost and don't know how to fix this. Which is the correct command, and where in the script does it go? Or am I completely wrong, and this has nothing to do with encoding?

Solution

I'm fairly certain that your input file is UTF-16 encoded, and the spaces you're seeing are actually null bytes.
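To see where those null bytes come from, here is a small sketch that encodes one of the sample lines from the question as UTF-16 and then misreads it as a single-byte encoding. The choice of little-endian byte order and of `latin-1` as the stand-in for "ANSI" are assumptions made for illustration; the actual file may differ.

```python
# Encode one of the sample lines from the question as UTF-16 little-endian.
# (Byte order is an assumption; files written on Windows are typically little-endian.)
text = "2013-12-23;;360.37"
raw = text.encode("utf-16-le")

# Every ASCII character becomes two bytes: the character itself plus a null byte.
print(raw[:8])  # b'2\x000\x001\x003\x00'

# Decoding those bytes with a single-byte encoding keeps the nulls as characters,
# which many editors and terminals display as spaces.
misread = raw.decode("latin-1")
print(len(text), len(misread))  # 18 36
```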

Try

with open("myfile.txt", "r", encoding="utf-16") as infile:
    lines = infile.readlines()

and see if the problem persists.
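Putting this together with the code from the question, a minimal end-to-end sketch could look like the following. The function name `copy_lines_after_header`, the file names, the three made-up header lines, and the choice of UTF-8 for the output file are all assumptions for illustration; adjust them to whatever your script and the consuming program actually need.

```python
import os
import tempfile


def copy_lines_after_header(src_path, dst_path, skip=3, src_encoding="utf-16"):
    """Copy every line after the first `skip` lines, decoding the source as UTF-16."""
    # The "utf-16" codec reads the byte order mark (BOM) at the start of the
    # file to decide between little-endian and big-endian.
    with open(src_path, "r", encoding=src_encoding) as infile:
        lines = infile.readlines()
    # Output encoding is an assumption; pick whatever the next program expects.
    with open(dst_path, "w", encoding="utf-8") as outfile:
        outfile.writelines(lines[skip:])


# Demo with a temporary UTF-16 file standing in for the delivered source file.
with tempfile.TemporaryDirectory() as tmp:
    src = os.path.join(tmp, "input.txt")
    dst = os.path.join(tmp, "output.txt")
    with open(src, "w", encoding="utf-16") as f:
        f.write("header 1\nheader 2\nheader 3\n"
                "2013-12-23;;360.37\n2013-12-24;;0.00\n")
    copy_lines_after_header(src, dst)
    with open(dst, "r", encoding="utf-8") as f:
        result = f.read()

print(result)
```

Because the text is decoded on the way in and encoded again on the way out, the output no longer contains the null bytes that showed up as spaces.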
