What Unicode normalization form does AWS S3 use for file names in a bucket?

Question

While working with UTF-8 file names in an AWS S3 bucket, I found that some quoted file names (in a link to a file in the bucket) can differ from the same file names as quoted by the code of my Python app (I am using the boto library). It turns out they differ because they use different Unicode normalization forms, and the problem goes away after applying unicodedata.normalize.

However, I haven't found any information about which normalization form AWS uses (NFC, NFKC, NFD or NFKD), so I would highly appreciate a pointer to a trusted source that provides that information. Thanks.

Recommended answer

It looks like S3 doesn't apply any normalization itself. If I upload (using the S3 web console) a file with a Unicode name (e.g. Ärende.txt) to S3 from a Mac and again from Windows, I end up with two files in S3. They look the same in the S3 console, but S3 considers them distinct because the encoding of the name is different.
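To illustrate why two names that look identical can be treated as distinct, here is a minimal sketch using only Python's standard unicodedata module. It assumes that one client sends the precomposed (NFC) form of the name and the other sends the decomposed (NFD) form; which platform produces which form is an assumption here, not something confirmed by the answer above.

```python
import unicodedata

# The same visible name, written with an escape so the source is unambiguous.
name = "\u00C4rende.txt"

nfc = unicodedata.normalize("NFC", name)  # precomposed: U+00C4
nfd = unicodedata.normalize("NFD", name)  # decomposed: "A" followed by U+0308

# The two strings render the same but are not equal.
print(nfc == nfd)
print(len(nfc), len(nfd))

# Their UTF-8 byte sequences differ too, so a store that compares
# names without normalizing would see two different names.
print(nfc.encode("utf-8"))
print(nfd.encode("utf-8"))
```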

You will have to consider exactly how this affects your application and its users, and adjust accordingly. For example, if your users may switch between environments (Mac vs Windows vs Linux) and expect consistent cross-platform behaviour, then it seems you will need to normalize the names yourself. If your users consistently work from a single platform, you most likely won't need to care.
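Normalizing the names yourself could look roughly like the sketch below. The helper name normalize_key is hypothetical, and the choice of NFC as the default form is an assumption for illustration; any form works as long as it is applied consistently wherever a name is written or looked up.

```python
import unicodedata


def normalize_key(key: str, form: str = "NFC") -> str:
    """Return the name in one fixed Unicode normalization form.

    Hypothetical helper: apply it to every name before it is used
    as an S3 key, both when uploading and when looking a file up.
    """
    return unicodedata.normalize(form, key)


# Two spellings of the same visible name.
decomposed = "A\u0308rende.txt"   # "A" plus combining diaeresis
precomposed = "\u00C4rende.txt"   # single precomposed character

# After normalization both map to the same string.
same = normalize_key(decomposed) == normalize_key(precomposed)
print(same)
```

The normalized string would then be passed as the key to whichever S3 client call the application already uses.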
