Python string including double quote character
Problem description
I have input strings made up of assorted characters, including double and single quotes (" and '):
B@SS$*JU(PQ
AD&^%$^@!$
%()%@@DDSFD"*")(#
ABD*E@(%J^&@
However, when I open the above input from a text file and just print it, the double quotes " in the third line are printed as \xe2\x80\x9d
I am aiming to do a simple character count:
B 2
@ 3
S 2
$ 3
etc.
So I would like to be able to output
" 3
in the above list. Should I replace the double quotes with something so that I can count them and print out the count?
Many thanks.
Recommended answer
\xe2\x80\x9d
is the UTF-8 encoding of the Unicode character U+201D (right double quotation mark), a "special" curly double quote rather than the plain ASCII ". Decoding the bytes from UTF-8 turns those three bytes into a single Unicode character. For example, in Python 2:
>>> print "\xe2\x80\x9d".decode("utf-8")
”
>>> len("\xe2\x80\x9d".decode("utf-8"))
1
If you are using Python 3:
>>> print(b"\xe2\x80\x9d".decode("utf-8"))
”
>>> len(b"\xe2\x80\x9d".decode("utf-8"))
1
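To see why the byte string and the decoded character behave differently when counting, it can help to inspect the character directly. The following is a minimal Python 3 sketch; the variable names `raw` and `ch` are illustrative only and do not come from the original answer.

```python
import unicodedata

raw = b"\xe2\x80\x9d"        # the three bytes that showed up in the printed output
ch = raw.decode("utf-8")      # decoding yields a single Unicode character

print(len(raw), len(ch))      # 3 bytes versus 1 character
print(hex(ord(ch)))           # code point of the character
print(unicodedata.name(ch))   # official Unicode name
print(ch == '"')              # False: this is not the ASCII double quote
```

Because the curly quote is a different character from the ASCII ", a character count will list the two separately unless they are deliberately mapped onto each other.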
So for your file that you are counting (in Python 2):
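from collections import defaultdict

# Create the tally once, before the loop, so it covers the whole file.
count = defaultdict(int)
with open("filename", 'r') as f:
    for text in f:
        # In Python 2 each line is a byte string; decode it so that a
        # multi-byte character such as \xe2\x80\x9d is counted once.
        decoded = text.decode("utf-8")
        for i in decoded:
            count[i] += 1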
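For Python 3, the same count could be sketched as follows, assuming the file is UTF-8 encoded. The function name `count_characters`, the `fold_quotes` option, the `QUOTE_MAP` table, and the sample text are all hypothetical and introduced here only for illustration. Opening the file with an explicit `encoding` makes Python decode each line, so no manual `.decode()` call is needed.

```python
import os
import tempfile
from collections import Counter

# Hypothetical mapping from curly quotes to their plain ASCII counterparts.
QUOTE_MAP = str.maketrans({
    "\u201c": '"', "\u201d": '"',   # left/right double quotation marks
    "\u2018": "'", "\u2019": "'",   # left/right single quotation marks
})


def count_characters(path, encoding="utf-8", fold_quotes=False):
    """Return a Counter of the characters in the file, ignoring newlines."""
    counts = Counter()
    with open(path, "r", encoding=encoding) as f:
        for line in f:
            line = line.rstrip("\n")
            if fold_quotes:
                # Optionally count curly quotes as plain " and '.
                line = line.translate(QUOTE_MAP)
            counts.update(line)
    return counts


# Demo on a small made-up sample written to a temporary file.
sample = "B@SS$*JU(PQ\n%()%@@DDSFD\u201d*\u201d)(#\n"
with tempfile.NamedTemporaryFile("w", encoding="utf-8",
                                 suffix=".txt", delete=False) as tmp:
    tmp.write(sample)

counts = count_characters(tmp.name)
folded = count_characters(tmp.name, fold_quotes=True)
os.remove(tmp.name)

for ch, n in counts.most_common():
    print(ch, n)
```

Whether to fold curly quotes into " is a design choice: leaving `fold_quotes` off reports exactly the characters present in the file, while turning it on answers the question "how many double quotes of any style are there?".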