Python string including double quote character

Question

I have input strings that are made up of characters, including double and single quotes (" and '):

B@SS$*JU(PQ
AD&^%$^@!$
%()%@@DDSFD"*")(#
ABD*E@(%J^&@

However, when I open the above input from a text file and just print it, the double quotes in the third line get printed as \xe2\x80\x9d.

I am aiming to do a simple character count:

B 2
@ 3
S 2
$ 3
etc.

So I would like to be able to output

" 3

in the above list. Should I replace the double quotes with something so I can count them and print out the count?

Thanks very much.

Recommended answer


\xe2\x80\x9d

is the UTF-8 byte sequence for the "special" right double quotation mark ” (U+201D), which is a different character from the plain ASCII double quote ". You could decode from UTF-8 into Unicode to turn these three bytes into a single Unicode character. In Python 2:

>>> print "\xe2\x80\x9d".decode("utf-8")
”
>>> len("\xe2\x80\x9d".decode("utf-8"))
1

If you are using Python 3:

>>> print(b"\xe2\x80\x9d".decode("utf-8"))
”
>>> len(b"\xe2\x80\x9d".decode("utf-8"))
1

So for the file whose characters you are counting (in Python 2):

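from collections import defaultdict

# Create the counter once, outside the loop, so totals accumulate across lines.
count = defaultdict(int)
with open("filename", 'r') as f:
    for text in f:
        # Python 2: decode the raw bytes so each character is a single item.
        decoded = text.decode("utf-8")
        for i in decoded:
            count[i] += 1  # note: this also counts the trailing newline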
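In Python 3 the same count can be sketched with `collections.Counter`, letting `open` do the UTF-8 decoding. The function name `count_chars`, the `fold_quotes` option, and the quote mapping are assumptions made for illustration, not part of the original answer. Folding curly quotes into " is one possible way to get a single " total, if that is what you want.

```python
from collections import Counter

# Hypothetical mapping: curly double quotes -> plain ASCII double quote.
CURLY_TO_STRAIGHT = str.maketrans({"\u201c": '"', "\u201d": '"'})


def count_chars(path, fold_quotes=False):
    """Count every character in a UTF-8 text file, ignoring line endings."""
    counts = Counter()
    with open(path, encoding="utf-8") as f:  # text mode decodes for us
        for line in f:
            line = line.rstrip("\r\n")       # do not count newlines
            if fold_quotes:
                line = line.translate(CURLY_TO_STRAIGHT)
            counts.update(line)              # adds 1 per character
    return counts


# Example use (assumes a file named "filename" exists):
# for char, n in count_chars("filename", fold_quotes=True).most_common():
#     print(char, n)
```

With `fold_quotes=False` the curly quote is reported under its own key, which keeps the count faithful to the file's contents.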
