Python: Two Identical Strings Are Viewed as Different


Problem description


I have two strings that by all indication look identical:

x1 = 'N C Soft - NCSOFT_Guild Wars 2 December 2013 :: BNLX_AD_Parallax_160x600'
x2 = 'N C Soft - NCSOFT_Guild Wars 2 December 2013 :: BNLX_CT_Parallax_160X600'

However, checking for equality shows they are not.

In [312]: if x1 != x2:
   .....:     print 'yep'
   .....:
yep

I also tried copying both strings out of the command prompt and then pasting them back in as new variables, but they are still not equal. I'm 80% sure it's because they're encoded in a weird way, with some odd characters inserted that I can't see, but type() just reports both as strings.

Is there any way I can see the "real" string? Any help is appreciated.

Solution

They are not the same; using difflib.ndiff() shows how these two values differ very clearly:

>>> import difflib
>>> print '\n'.join(difflib.ndiff([x1], [x2]))
- N C Soft - NCSOFT_Guild Wars 2 December 2013 :: BNLX_AD_Parallax_160x600
?                                                      ^^             ^

+ N C Soft - NCSOFT_Guild Wars 2 December 2013 :: BNLX_CT_Parallax_160X600
?                                                      ^^             ^

In general, when in doubt, use repr() to look at the representation. Python 2 will use escapes for any non-printable or non-ASCII character in the string, so any 'funny' characters will stand out like a sore thumb. In Python 3, use the ascii() function to get the same result, because repr() there is less conservative (it leaves printable non-ASCII characters unescaped) and Unicode is rife with character combinations that look the same at first glance.
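As a rough illustration of the repr()/ascii() advice, here is a minimal Python 3 sketch. The strings `plain` and `lookalike` are made-up examples, not the strings from the question: the second one deliberately uses a Cyrillic capital letter (U+0421) in place of the Latin "C", which looks the same in most fonts.

```python
import unicodedata

# Hypothetical example strings (not taken from the question).
plain = 'NCSOFT'
lookalike = 'N\u0421SOFT'  # U+0421 looks like a Latin "C" but is Cyrillic

# The two strings render alike but do not compare equal.
same = plain == lookalike
print(same)

# ascii() escapes every non-ASCII character, so the odd one becomes visible.
escaped_plain = ascii(plain)
escaped_lookalike = ascii(lookalike)
print(escaped_plain)
print(escaped_lookalike)

# unicodedata.name() gives the official name of each non-ASCII character.
odd_names = [unicodedata.name(ch) for ch in lookalike if ord(ch) > 127]
print(odd_names)
```

The escaped form shows a `\u0421` escape where the plain string has an ordinary "C", which is the kind of difference that is invisible when the strings are simply printed.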

For strings where you still cannot see what changes between the two, the above difflib tool can also help point out what exactly changed.
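The session above uses the Python 2 print statement. A Python 3 version of the same check might look like the sketch below; the helper `differing_positions` is a hypothetical addition for illustration and is not part of difflib.

```python
import difflib

x1 = 'N C Soft - NCSOFT_Guild Wars 2 December 2013 :: BNLX_AD_Parallax_160x600'
x2 = 'N C Soft - NCSOFT_Guild Wars 2 December 2013 :: BNLX_CT_Parallax_160X600'

# ndiff() compares sequences of lines, so each string is wrapped in a list.
diff_lines = list(difflib.ndiff([x1], [x2]))
print('\n'.join(diff_lines))


def differing_positions(a, b):
    """Hypothetical helper: return (index, char_a, char_b) for each mismatch.

    zip() stops at the shorter string, so compare the lengths separately
    if the strings may differ in length.
    """
    return [(i, ca, cb) for i, (ca, cb) in enumerate(zip(a, b)) if ca != cb]


mismatches = differing_positions(x1, x2)
print(len(x1) == len(x2))
print(mismatches)
```

For the two strings from the question, the mismatches are "AD" versus "CT" and the lowercase "x" versus uppercase "X" in the size suffix, matching the carets in the ndiff output.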
