Make a list of the Unicode words in a file
Question
My code is:
import codecs

f = codecs.open(r'C:\Users\Admin\Desktop\nepali.txt', 'r', 'UTF-8')
nepali = f.read().split()
for i in nepali:
    print i
This displays the words in the file:
यो
किताब
टेबुल
मा
छ
यो
एक
किताब
हो
केटा
But when I try to create a list of the words with code:
file=codecs.open(r'C:\Users\Admin\Desktop\nepali.txt', 'r', 'UTF-8')
nepali = list(file.read().split())
print nepali
Now the output looks like this:
[u'\ufeff\u092f\u094b', u'\u0915\u093f\u0924\u093e\u092c', u'\u091f\u0947\u092c\u0941\u0932', u'\u092e\u093e', u'\u091b', u'\u092f\u094b', u'\u090f\u0915', u'\u0915\u093f\u0924\u093e\u092c', u'\u0939\u094b',]
The output should look like this:
[यो, किताब, टेबुल, मा, छ, यो, एक, किताब, हो]
Recommended answer
You are looking at the output of the repr() function, which is always used to display the contents of containers. That output is meant for debugging, not for end users: any non-printable or non-ASCII codepoint is represented by an escape sequence. Depending on the codepoint, that is either a single-character escape such as \t or \n, or an escape with 2, 4, or 8 hex digits, such as \xe5, \u2603, or \U0001f4e2.
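The difference between the escaped form and the readable form can be reproduced in a few lines. This is only a sketch and it assumes Python 3, where repr() no longer escapes printable non-ASCII text; the built-in ascii() is used here as a stand-in because it escapes non-ASCII codepoints much as Python 2's repr() does for unicode strings. The two-word list is a shortened copy of the data in the question.

```python
# Assumption: Python 3. The list holds the first two words from the question.
words = ['\u092f\u094b', '\u0915\u093f\u0924\u093e\u092c']

# ascii() escapes every non-ASCII codepoint, similar to what Python 2's
# repr() produces for a list of unicode strings.
escaped = ascii(words)

# Joining the strings directly keeps the characters themselves.
readable = '[{}]'.format(', '.join(words))

print(escaped)   # escape sequences, as in the question's output
print(readable)  # the Devanagari text itself
```

The container form always goes through the debugging representation of each element, while the joined string is ordinary text.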
You'll have to generate the output manually:
print u'[{}]'.format(u', '.join(nepali))
This produces a unicode string formatted to look like a list object, but without using repr(): the strings are joined with ', ' (comma and space) and the result is wrapped in square brackets.
Demo:
>>> nepali = [u'\ufeff\u092f\u094b', u'\u0915\u093f\u0924\u093e\u092c', u'\u091f\u0947\u092c\u0941\u0932', u'\u092e\u093e', u'\u091b', u'\u092f\u094b', u'\u090f\u0915', u'\u0915\u093f\u0924\u093e\u092c', u'\u0939\u094b',]
>>> print u'[{}]'.format(u', '.join(nepali))
[यो, किताब, टेबुल, मा, छ, यो, एक, किताब, हो]
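One detail the demo hides: the first string in the list starts with \ufeff, a byte order mark left over from the file, which is invisible when printed but is still part of the word. A possible way to drop it, sketched here under the assumption of Python 3 (io.open also exists in Python 2.7), is to decode with the 'utf-8-sig' codec. The helper names read_words and format_words and the temporary sample file are hypothetical and exist only for this illustration.

```python
import io
import os
import tempfile


def read_words(path):
    """Return the whitespace-separated words in a UTF-8 text file."""
    # 'utf-8-sig' decodes UTF-8 and drops a leading byte order mark, if any.
    with io.open(path, 'r', encoding='utf-8-sig') as handle:
        return handle.read().split()


def format_words(words):
    """Format the words to look like a list, without going through repr()."""
    return u'[{}]'.format(u', '.join(words))


# Hypothetical sample file: three words from the question, preceded by a BOM.
sample = u'\ufeff\u092f\u094b \u0915\u093f\u0924\u093e\u092c\n\u091b\n'
fd, sample_path = tempfile.mkstemp(suffix='.txt')
os.close(fd)
with io.open(sample_path, 'w', encoding='utf-8') as handle:
    handle.write(sample)

words = read_words(sample_path)
os.remove(sample_path)

print(format_words(words))
```

With the plain 'UTF-8' codec the first word would keep the \ufeff prefix, so it would not compare equal to the same word elsewhere in the list.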
However, if you want to show this to an end user, why use the square brackets at all?