Python Inconsistent Special Character Storage In String

Views: 176 | Published: 2020/7/12 18:48:55 | Tags: python, python-3.x, string, python-3.7, unicode-normalization

Problem description

The version is Python 3.7. I've just found out that Python will sometimes store the character ñ in a string using more than one representation, and I'm completely at a loss as to why this happens or how to deal with it.

I'm not sure of the best way to show this issue, so I'm just going to show some code output.

I have two strings, s1 and s2, both set to 'Dan Peña'.

They are both of type str.

I can run this code:

print(s1 == s2) # prints False
print(len(s1)) # prints 8
print(len(s2)) # prints 9
print(type(s1)) # prints <class 'str'>
print(type(s2)) # prints <class 'str'>
for i in range(len(s1)):
    print(s1[i] + ", " + s2[i])

The output of the loop is:

D, D
a, a
n, n
 ,  
P, P
e, e
ñ, n
a, ~
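
The output above is consistent with the two strings holding different Unicode normalization forms of ñ (the question is tagged unicode-normalization). The following is a minimal sketch that reproduces the same lengths and comparison result. It assumes that s1 holds the precomposed code point U+00F1 and that s2 holds a plain n followed by U+0303 COMBINING TILDE; the helper name describe is made up for illustration.

```python
import unicodedata

# Assumption: s1 is the composed form, s2 the decomposed form of the same text.
s1 = "Dan Pe\u00f1a"    # ñ as a single code point, U+00F1
s2 = "Dan Pen\u0303a"   # n followed by U+0303 COMBINING TILDE


def describe(s):
    """Hypothetical helper: list each code point with its Unicode name."""
    return [f"U+{ord(ch):04X} {unicodedata.name(ch)}" for ch in s]


print(len(s1), len(s2))   # the two lengths differ by one
print(s1 == s2)           # str comparison is code point by code point
print(describe(s1)[6:])   # code points from the ñ onward in s1
print(describe(s2)[6:])   # code points from the n onward in s2

# Normalizing both strings to the same form makes them compare equal.
print(unicodedata.normalize("NFC", s1) == unicodedata.normalize("NFC", s2))
```

Python stores whatever code points it is given and does not normalize strings on its own, so both forms can coexist in one program depending on where the text came from.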

So, are there any Python methods for dealing with these inconsistencies, or at least some specification of when Python will use which representation?

It would also be nice to know why Python chose to implement strings this way.

One string is retrieved from a Django database, and the other is obtained by parsing a filename returned by a listdir call.

from os import listdir

from app.models import Model
from django.core.management.base import BaseCommand

class Command(BaseCommand):

    def handle(self, *args, **kwargs):
        load_dir = "load_dir_name"
        save_dir = "save_dir"

        files = listdir(load_dir)
        save_file_map = {file[:file.index("_thumbnail.jpg")]: f"{save_dir}/{file}" for file in files}
        for obj in Model.objects.all():
            s1 = obj.title
            save_file_path = save_file_map[s1] # Key error when encountering ñ.

However, when I search through the save_file_map dict, I find a key that is exactly the same as s1, except that the ñ is encoded as the two characters n~ rather than the single character ñ.

Note that the files I load with listdir in the code above were named based on the obj.title field in the first place, so a file with that name should be guaranteed to exist in the load_dir directory.
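
One possible way to make the lookup robust is to normalize both the dict keys and the lookup value to the same Unicode form before comparing. The sketch below is not from the original post. It replaces the listdir call and obj.title with hard-coded stand-ins, assumes the filename arrives in decomposed form (a guess at the cause, since some filesystems are known to return decomposed names), and uses a made-up helper called nfc.

```python
import unicodedata


def nfc(text):
    """Hypothetical helper: return text in NFC (composed) form."""
    return unicodedata.normalize("NFC", text)


suffix = "_thumbnail.jpg"
save_dir = "save_dir"

# Stand-in for listdir(load_dir); assumed to contain a decomposed ñ (n + U+0303).
files = ["Dan Pen\u0303a_thumbnail.jpg"]

# Normalize each key once when building the map; keep the original file name in the path.
save_file_map = {
    nfc(file[:file.index(suffix)]): f"{save_dir}/{file}"
    for file in files
}

# Stand-in for obj.title; assumed to contain the composed ñ (U+00F1).
title = "Dan Pe\u00f1a"

# Normalize the lookup value the same way, so both sides use one representation.
save_file_path = save_file_map[nfc(title)]
print(save_file_path)
```

Either NFC or NFD would work here, as long as the same form is applied to both the keys and the lookup value.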
