Python regex against Latin-1 character encoding?

Views: 486 Published: 2016/11/19 15:48:16 Tags: python encoding utf-8 character-encoding


Problem description

I have a file that (I believe) is Latin-1 encoded.

However, I cannot match regexes against the contents of this file.

If I cat the file, it looks fine:

However, in Python I cannot find the string:

In [12]: txt = open("b").read()

In [13]: print txt
  <Vw_IncidentPipeline_Report>


In [14]: txt
Out[14]: '\x00 \x00 \x00<\x00V\x00w\x00_\x00I\x00n\x00c\x00i\x00d\x00e\x00n\x00t\x00P\x00i\x00p\x00e\x00l\x00i\x00n\x00e\x00_\x00R\x00e\x00p\x00o\x00r\x00t\x00>\x00\r\x00\n'

In [22]: txt.find("Vw_IncidentPipeline_Report")
Out[22]: -1

In [23]: txt.decode("latin-1")
Out[23]: u'\x00 \x00 \x00<\x00V\x00w\x00_\x00I\x00n\x00c\x00i\x00d\x00e\x00n\x00t\x00P\x00i\x00p\x00e\x00l\x00i\x00n\x00e\x00_\x00R\x00e\x00p\x00o\x00r\x00t\x00>\x00\r\x00\n'

In [25]: txt.decode("utf-16le")
Out[25]: u'\u2000\u2000\u3c00\u5600\u7700\u5f00\u4900\u6e00\u6300\u6900\u6400\u6500\u6e00\u7400\u5000\u6900\u7000\u6500\u6c00\u6900\u6e00\u6500\u5f00\u5200\u6500\u7000\u6f00\u7200\u7400\u3e00\u0d00\u0a00'

How do I decode the string correctly, so that I can find strings within it?

Recommended answer

It's not Latin-1, it's UTF-16 big endian:

>>> txt = '\x00 \x00 \x00<\x00V\x00w\x00_\x00I\x00n\x00c\x00i\x00d\x00e\x00n\x00t\x00P\x00i\x00p\x00e\x00l\x00i\x00n\x00e\x00_\x00R\x00e\x00p\x00o\x00r\x00t\x00>\x00\r\x00\n'
>>> txt.decode("utf-16be")
u'  <Vw_IncidentPipeline_Report>\r\n'
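The session above is Python 2. As a minimal sketch of the same idea in Python 3, where the raw data is a `bytes` object and decoding produces a `str`, the snippet below rebuilds the bytes from the question, decodes them as UTF-16 big endian, and then searches the result. The names `raw`, `text`, `tag`, and `found_at` are illustrative only, and the pattern `<(\w+)>` is an assumed example of the kind of regex one might run.

```python
import re

# Rebuild the bytes shown in the question: each character is preceded by a
# NUL byte, which is what UTF-16 big endian looks like for ASCII text.
raw = "  <Vw_IncidentPipeline_Report>\r\n".encode("utf-16-be")

# Searching the undecoded bytes fails because of the interleaved NUL bytes.
missing = raw.find(b"Vw_IncidentPipeline_Report")

# Decode with the matching codec before doing any string work.
text = raw.decode("utf-16-be")

# Plain substring search and regex matching both work on the decoded text.
found_at = text.find("Vw_IncidentPipeline_Report")
match = re.search(r"<(\w+)>", text)
tag = match.group(1) if match else None
```

The key point is that the search pattern and the data must be in the same form: decode first, then match against the decoded text.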


So just decode it that way and live happily ever after ;-).
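If the data comes straight from a file, one option in Python 3 is to let `open()` do the decoding by passing the encoding explicitly. The sketch below is an illustration under stated assumptions: it writes a temporary file as a stand-in for the file `b` from the question (same text, UTF-16 big endian, no byte order mark), and the variable names are hypothetical.

```python
import os
import tempfile

# Hypothetical stand-in for the file "b" in the question.
payload = "  <Vw_IncidentPipeline_Report>\r\n".encode("utf-16-be")

with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(payload)
    path = tmp.name

try:
    # Naming the encoding makes open() return decoded text instead of
    # leaving the NUL-interleaved bytes for the caller to deal with.
    with open(path, encoding="utf-16-be") as handle:
        text = handle.read()
finally:
    os.remove(path)

found = "Vw_IncidentPipeline_Report" in text
```

The byte order is spelled out here because the data shown in the question does not start with a byte order mark; the generic `utf-16` codec relies on that mark to detect the byte order when decoding.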
