Can a PHP file name (or a dir in its full path) have UTF-8 characters?

Problem description

I would like to access a PHP file whose name has UTF-8 characters in it.

The file does not have a BOM in it. It just contains an echo statement that displays a few unicode characters.

Accessing the PHP page from the browser (Firefox 3.0.8, IE7) results in HTTP error 500.

There are two entries in the Apache log (the file is /க.php; the letter க is a single multi-byte character whose UTF-8 encoding is the three bytes \xe0\xae\x95 shown in the log below):

[Sat Apr 04 09:30:25 2009] [error] [client 127.0.0.1] PHP Warning: Unknown: failed to open stream: No such file or directory in Unknown on line 0

[Sat Apr 04 09:30:25 2009] [error] [client 127.0.0.1] PHP Fatal error: Unknown: Failed opening required 'D:/va/ROOT/\xe0\xae\x95.php' (include_path='.;C:\php5\pear') in Unknown on line 0
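
To see where the bytes in the log come from, here is a minimal sketch that inspects the file name from the question. The variable names are made up for illustration, and the name is spelled as raw bytes so the result does not depend on how the script itself is saved.

```php
<?php
// File name from the question, written as raw UTF-8 bytes so the example
// does not depend on the encoding of this source file.
$name = "\xE0\xAE\x95.php";

// strlen() counts bytes, not characters: 3 bytes for the letter plus 4 for ".php".
$byteLength = strlen($name);

// With the /u modifier PCRE treats the subject as UTF-8, so "." matches one
// whole character; this counts characters rather than bytes.
$charCount = preg_match_all('/./u', $name);

// Hex dump of the bytes; the first three are the e0 ae 95 seen in the log.
$hex = bin2hex($name);

echo "$byteLength bytes, $charCount characters, hex $hex\n";
```

The gap between the byte count and the character count is the whole issue: the web server hands PHP the three bytes, and PHP then has to map them onto a file name the operating system recognizes.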

The same page works when file and dir names are in English. In the same setup, there is no problem using SSI for these pages.

Edit

Removed info on url rewriting since it does not seem to be a factor.

When mod_rewrite is removed, the PHP file still does not work. It works if the file is renamed to a non-UTF name. However, shtml works even with UTF characters in the file and/or path name.

Recommended answer

I have come across the same problem and done some research and conclude the following. This is for php5 on Windows; it is probably true on other platforms but I haven't checked.

  1. All PHP filesystem functions (dir, is_dir, is_file, file, filemtime, filesize, file_exists, etc.) accept and return file names only in ISO-8859-1, irrespective of the default_charset set in the program or ini files.

Where a filename contains a Unicode character, dir->read returns it as the corresponding ISO-8859-1 character if there is one; otherwise it substitutes a question mark.

When referencing a file, e.g. in is_file or file, if you pass in a UTF-8 file name, the file will not be found when the name contains any character of two or more bytes. However, is_file(utf8_decode($filename)) etc. will work, provided every character in the name is representable in ISO-8859-1.
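
The "representable in ISO-8859-1" condition can be checked before touching the filesystem. The sketch below is only an illustration: fitsLatin1 and toLatin1Name are hypothetical helper names, the code assumes PHP 7.1 or later (for the nullable return type) with the mbstring extension loaded, and it uses mb_convert_encoding in place of the answer's utf8_decode because utf8_decode is deprecated as of PHP 8.2.

```php
<?php
// Hypothetical helper: true if every code point of a valid UTF-8 string lies
// in U+0000..U+00FF, the range ISO-8859-1 can represent.
function fitsLatin1(string $utf8Name): bool
{
    // With /u the pattern works on code points; invalid UTF-8 makes
    // preg_match() return false, which is also treated as "does not fit".
    return preg_match('/^[\x{00}-\x{FF}]*$/u', $utf8Name) === 1;
}

// Hypothetical helper: the ISO-8859-1 form of the name, or null when the
// name cannot be expressed in that encoding at all.
function toLatin1Name(string $utf8Name): ?string
{
    if (!fitsLatin1($utf8Name)) {
        return null;
    }
    return mb_convert_encoding($utf8Name, 'ISO-8859-1', 'UTF-8');
}

// "café.txt" fits: the two-byte UTF-8 "é" becomes the single byte 0xE9.
$latin = toLatin1Name("caf\xC3\xA9.txt");

// The name from the question does not fit, so there is nothing to pass on.
$tamil = toLatin1Name("\xE0\xAE\x95.php");
```

Under the behavior described above, a non-null result is what would be passed to is_file and friends, while a null result means the file cannot be addressed this way and needs one of the workarounds.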

In other words, PHP 5 (at least on Windows) cannot address files whose names contain characters outside ISO-8859-1 at all.

If a UTF-8 URL with multibyte characters is requested and this corresponds directly to a file, PHP won't be able to open the file because it cannot address it.

If you simply want pretty URLs in your language the suggestion of using mod_rewrite seems like a good one.

But if you are storing and retrieving files uploaded and downloaded by users, this problem has to be resolved. One way is to use an arbitrary (non UTF-8) file name, such as an incrementing number, on the server and index the files in a database, an XML file, or similar. Another way is to store the files in the database itself as BLOBs. A third way (where it is perhaps easier to see what is going on, and which is not subject to problems if your index gets corrupted) is to encode the filenames yourself: a good technique is to urlencode all incoming filenames when storing them on the server disk and urldecode them before setting the filename in the MIME header for the download. Every character other than ASCII letters, digits, and - _ . is then encoded as %nn (urlencode writes a space as +), so problems with spaces in file names, cross-platform support, and pattern matching are largely avoided.
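
The urlencode approach can be sketched as follows. encodeForDisk and decodeForDownload are hypothetical names chosen for this example, the sample file names are invented, and the type declarations assume PHP 7 or later.

```php
<?php
// Hypothetical helper: turn an uploaded UTF-8 file name into an ASCII-only
// name that is safe to use on the server disk.
function encodeForDisk(string $uploadedName): string
{
    return urlencode($uploadedName);
}

// Hypothetical helper: recover the original name, e.g. before putting it
// into the download's MIME header.
function decodeForDownload(string $storedName): string
{
    return urldecode($storedName);
}

// Invented sample name: the letter from the question, a space, then ASCII.
$original = "\xE0\xAE\x95 notes.txt";

$stored = encodeForDisk($original);      // "%E0%AE%95+notes.txt"
$restored = decodeForDownload($stored);  // identical to $original

echo $stored, "\n";
```

Because urlencode also escapes / and \, the stored name cannot contain a path separator. The names "." and ".." pass through unchanged, so an upload handler would still want to reject those separately.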
