Python string prints as [u'String']


Problem description

This will surely be an easy one, but it is really bugging me.

I have a script that reads in a webpage and uses Beautiful Soup to parse it. From the soup I extract all the links, as my final goal is to print out link.contents.

All of the text that I am parsing is ASCII. I know that Python treats strings as unicode, and I am sure this is very handy, just of no use in my wee script.

Every time I print a variable that holds 'String', I get [u'String'] printed to the screen. Is there a simple way of getting this back into plain ASCII, or should I write a regex to strip it?

Recommended answer

[u'ABC'] is a one-element list containing a unicode string. Beautiful Soup always produces Unicode. So you need to convert the list to a single unicode string, and then convert that to ASCII.
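To make the list-versus-string distinction concrete, here is a minimal sketch. The variable `links` is a hypothetical stand-in for whatever value the script in the question holds; it is not taken from the original code.

```python
# Hypothetical stand-in for the value from the question.
links = [u"ABC"]

# Printing a list shows the repr() of each element, which is where the
# brackets and quotes (and, on Python 2, the u prefix) come from.
list_repr = repr(links)
print(list_repr)

# Printing the element itself shows only the text.
first = links[0]
print(first)

# If the list may hold several strings, join them into one string first.
joined = u"".join(links)
print(joined)
```

Nothing needs to be stripped with a regex: the brackets and prefix are part of how the list is displayed, not part of the string.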

I don't know exactly how you got the one-element lists; the contents member would be a list of strings and tags, which is apparently not what you have. Assuming that you really always get a list with a single element, and that your text is really only ASCII, you would use this:

 soup[0].encode("ascii")

However, please double-check that your data is really ASCII. Pure ASCII is pretty rare; the data is much more likely to be latin-1 or utf-8.

 soup[0].encode("latin-1")


 soup[0].encode("utf-8")
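The difference between these encodings only shows up once the text contains a non-ASCII character. The following sketch uses a made-up sample string (an assumption, not data from the question) to show what each call does with it.

```python
# Hypothetical sample containing one non-ASCII character (U+00E9).
text = u"caf\u00e9"

# ASCII cannot represent U+00E9, so a strict encode raises an error.
try:
    text.encode("ascii")
    ascii_ok = True
except UnicodeEncodeError:
    ascii_ok = False

# latin-1 stores U+00E9 in a single byte; utf-8 needs two bytes for it.
latin1_bytes = text.encode("latin-1")
utf8_bytes = text.encode("utf-8")

print(ascii_ok)
print(len(latin1_bytes))
print(len(utf8_bytes))
```

If an encode("ascii") call fails like this on real pages, that is a sign the data is not pure ASCII after all.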

Or you can ask Beautiful Soup what the original encoding was and encode the string back into that encoding (the attribute is named originalEncoding in Beautiful Soup 3 and original_encoding in Beautiful Soup 4):

 soup[0].encode(soup.originalEncoding)
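The idea of encoding back into the page's original encoding can be sketched without Beautiful Soup itself. In the example below, `raw_page` and `original_encoding` are hypothetical stand-ins: the first plays the bytes fetched from the site, the second plays the encoding name that Beautiful Soup would report.

```python
# Hypothetical stand-ins, not values from the question.
raw_page = b"caf\xe9"
original_encoding = "latin-1"

# Decoding turns the raw bytes into unicode text, which is the form in
# which Beautiful Soup hands strings back.
text = raw_page.decode(original_encoding)

# Encoding with the same name reproduces the original bytes exactly.
round_trip = text.encode(original_encoding)

print(round_trip == raw_page)
```

Using the reported encoding avoids guessing between latin-1 and utf-8, at the cost of trusting that the detection was right.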
