Find and replace both quotation styles in a Python unicode string
Question
I'm trying to replace substrings marked with either quotation mark style ("..." and “...”) in a Python string.
I've already written a regex that handles the standard quotation marks:
print re.sub(r'\"(.+?)\"', r'<em>"\1"</em>', self.title)
When I try to do the same for the literary (curly) ones, it doesn't replace anything:
return re.sub(r'\“(.+?)\”', r'<em>“\1”</em>', self.title)
In fact, as it stands, I can't even get a conditional check to work:
quote_list = ['“', '”']
if all(character in self.title for character in quote_list):
    print "It has literary quotes"
    print re.sub(r'\“(.+?)\”', r'<em>“\1”</em>', self.title)
print re.sub(r'\"(.+?)\"', r'<em>"\1"</em>', self.title)
EDIT: Further context: this is a method on a model object.
class Entry(models.Model):
    title = models.CharField(max_length=200)

    def render_title(self):
        """
        This function wraps italics around quotation marks
        """
        quote_list = ['“', '”']
        if all(character in self.title for character in quote_list):
            print "It has literary quotes"
            return re.sub(r'\“(.+?)\”', r'<em>“\1”</em>', self.title)
        return re.sub(r'\"(.+?)\"', r'<em>"\1"</em>', self.title)
I am not well-versed in regex. What am I doing wrong?
EDIT 2: One step closer to the problem! It lies in the fact that I'm dealing with unicode strings. I'm still stumped as to how I can solve this. Any help is appreciated!
>>> title = u"sdsfgsdfgsdgfsdgs “ asd” asd"
>>> print re.sub(r'\“(.+?)\”', r'<em>“\1”</em>', title)
sdsfgsdfgsdgfsdgs “ asd” asd
>>> title = "sdsfgsdfgsdgfsdgs “ asd” asd"
>>> print re.sub(r'\“(.+?)\”', r'<em>“\1”</em>', title)
sdsfgsdfgsdgfsdgs <em>“ asd”</em> asd
Recommended answer
I finally found an answer. After printing the variable as suggested by @interjay, I found out that the string was a unicode string.
Comparing it with a plain (byte) string didn't work, so I removed the conditional and used this answer to simply make the regex patterns unicode raw strings (the ur prefix), which handle both simple and “literary” quotes.
title = re.sub(ur'\“(.+?)\”', ur'“<em>\1</em>”', self.title) # notice the ur
title = re.sub(ur'\"(.+?)\"', ur'"<em>\1</em>"', title)
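As a rough illustration of why the byte-string pattern never matched, here is a minimal sketch. It is written in Python 3 syntax, which is an assumption on my part (the code above is Python 2), and the sample title is made up. It shows that a curly quote is a single character in text but three bytes in UTF-8, so a pattern built from the encoded bytes cannot line up with a unicode title.

```python
import re

# Hypothetical sample title; in Python 3 every str is already unicode text.
# \u201c and \u201d are the left and right curly double quotes.
title = "Review of \u201cDune\u201d"

# The same opening quote as UTF-8 bytes: three bytes, not one character.
left_quote_bytes = "\u201c".encode("utf-8")
print(len("\u201c"), len(left_quote_bytes))

# A text pattern applied to a text title matches as expected.
pattern = "\u201c(.+?)\u201d"
result = re.sub(pattern, "\u201c<em>\\1</em>\u201d", title)
print(result)

# Python 3 refuses to apply a bytes pattern to text. Python 2 accepts the
# mix, which is how the mismatch in the question went unnoticed.
try:
    re.sub(pattern.encode("utf-8"), b"", title)
except TypeError as exc:
    print("TypeError:", exc)
```

In Python 3 the ur prefix no longer exists; a plain r'...' pattern is already unicode.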
I saw in a comment here (unfortunately now deleted) how one could merge the two statements above into one, but for now it works.
Thank you very much for your help!