Python unicode equal comparison failed in terminal but working under Spyder editor


Question

I need to compare a unicode string read from a UTF-8 file with a constant defined in the Python script.

I'm using Python 2.7.6 on Linux.

If I run the script below within Spyder (a Python editor), it works, but if I invoke it from a terminal, the test fails. Do I need to import or define something in the terminal before invoking the script?

Script ("pythonscript.py"):

#!/usr/bin/env python
# -*- coding: utf-8 -*-

import csv

some_french_deps = []
idata_raw = csv.DictReader(open("utf8_encoded_data.csv", 'rb'), delimiter=";")
for rec in idata_raw:
    depname = unicode(rec['DEP'],'utf-8')
    some_french_deps.append(depname)

test1 = "Tarn"
test2 = "Rhône-Alpes"
if test1==some_french_deps[0]:
  print "Tarn test passed"
else:
  print "Tarn test failed"
if test2==some_french_deps[2]:
  print "Rhône-Alpes test passed"
else:
  print "Rhône-Alpes test failed"

utf8_encoded_data.csv:

DEP
Tarn
Lozère
Rhône-Alpes
Aude

Output when run from the Spyder editor:

Tarn test passed
Rhône-Alpes test passed

Output when run from a terminal:

$ ./pythonscript.py 
Tarn test passed
./pythonscript.py:20: UnicodeWarning: Unicode equal comparison failed to convert both arguments to Unicode - interpreting them as being unequal
  if test2==some_french_deps[2]:
Rhône-Alpes test failed

Recommended answer

You are comparing a byte string (type str) with a unicode value. When the two types are compared, Python 2 implicitly decodes the byte string using the default encoding. Spyder has changed that default from ASCII to UTF-8, and your byte strings are encoded as UTF-8, so under Spyder the comparison succeeds. In the terminal the default is still ASCII: "Tarn" decodes fine, but the bytes for "ô" do not, so Python issues the UnicodeWarning and treats the two values as unequal.
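The difference between the two environments can be reproduced with explicit decodes. This is a minimal sketch, not the interpreter's actual comparison code: the variable names are illustrative, and the explicit `.decode()` calls stand in for the implicit conversion described in the answer. It is written to run unchanged on Python 2.7 and Python 3.

```python
# -*- coding: utf-8 -*-
# UTF-8 bytes of "Rhône-Alpes", written with escapes so the example does
# not depend on the encoding of this source file.
raw = b"Rh\xc3\xb4ne-Alpes"
expected = u"Rh\xf4ne-Alpes"

# Terminal case: the default encoding is ASCII, so the decode fails.
try:
    raw.decode("ascii")
    ascii_ok = True
except UnicodeDecodeError:
    ascii_ok = False

# Spyder case: the default was changed to UTF-8, so the decode succeeds
# and the result equals the unicode value.
utf8_equal = raw.decode("utf-8") == expected

# Pure-ASCII bytes decode identically under both codecs, which is why
# the "Tarn" test passes in both environments.
tarn_equal = b"Tarn".decode("ascii") == u"Tarn"

print("ascii_ok=%s utf8_equal=%s tarn_equal=%s"
      % (ascii_ok, utf8_equal, tarn_equal))
```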

The solution is to stop using byte strings and use unicode literals for your two test values instead:

test1 = u"Tarn"
test2 = u"Rhône-Alpes"
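The same idea, decoding bytes once at the input boundary and comparing only unicode values afterwards, could look like the sketch below. The helper `to_unicode` is hypothetical (it is not part of the original script), and the in-memory list stands in for the values read from `utf8_encoded_data.csv`, so the example runs without the file on both Python 2.7 and Python 3.

```python
# -*- coding: utf-8 -*-

def to_unicode(value, encoding="utf-8"):
    """Hypothetical helper: decode byte strings, pass text through unchanged."""
    if isinstance(value, bytes):
        return value.decode(encoding)
    return value

# Stand-in for the DEP column as raw UTF-8 bytes read in binary mode.
raw_deps = [b"Tarn", b"Loz\xc3\xa8re", b"Rh\xc3\xb4ne-Alpes", b"Aude"]
some_french_deps = [to_unicode(dep) for dep in raw_deps]

# Unicode literals for the test values, as the answer recommends.
test1 = u"Tarn"
test2 = u"Rh\xf4ne-Alpes"  # same value as u"Rhône-Alpes" in a UTF-8 source file

tarn_passed = test1 == some_french_deps[0]
rhone_passed = test2 == some_french_deps[2]

print("Tarn test %s" % ("passed" if tarn_passed else "failed"))
print("Rhone-Alpes test %s" % ("passed" if rhone_passed else "failed"))
```

With both sides already unicode, the comparison no longer depends on the interpreter's default encoding, so the result should be the same in Spyder and in a terminal.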

Changing the system default encoding is, in my opinion, a terrible idea. Your code should handle Unicode correctly instead of relying on implicit conversions; changing the rules of those conversions only adds confusion and does not make the task any easier.
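To check which default encoding a given environment is using, a quick diagnostic along these lines may help. It only reads the setting and changes nothing. Running it once in Spyder and once in the terminal is a suggestion for confirming the difference described above, not a step from the original answer.

```python
import sys

# Report the interpreter's default encoding. Python 2 uses it for
# implicit str/unicode conversions.
default_encoding = sys.getdefaultencoding()
print("default encoding: %s" % default_encoding)
```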
