PHP readdir problem with Japanese file names

Problem description

I have the following code:

<?php
if ($handle = opendir('C:/xampp/htdocs/movies')) {
    while (false !== ($file = readdir($handle))) {
        if ($file != "." && $file != "..") {
            echo $file."<br />\n";
        }
    }
    closedir($handle);
}
?>

When a file name contains multibyte characters such as Japanese, it isn't displayed properly: I get kyuukyoku Choujin R ?????~? rather than kyuukyoku Choujin R 究極超人あ~る

Is there any way to make it display the correct name, or at least keep the file downloadable by others?

Thanks for your help :)

Recommended answer

I can't speak definitively for PHP, but I suspect it's the same basic problem that Python 2 had (before it later added special support for Unicode string filenames).

My belief is that PHP handles filenames through the standard C library 'open'-et-al functions, which are byte-based. On Windows (NT) these try to encode the real Unicode filename using the system codepage. That might be cp1252 (similar to ISO-8859-1) on Western machines, or cp932 (similar to Shift-JIS) on Japanese machines. Any character that doesn't exist in the system codepage comes back as a '?' character, and you'll be unable to refer to that file.
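The lossy encoding step described above can be imitated in PHP itself. This is only a sketch: it assumes the mbstring extension is loaded and that the script is saved as UTF-8, and the helper name toCodepage is made up for illustration. It does not call the Windows API; it just shows what happens to characters that a target codepage cannot represent.

```php
<?php
// Hypothetical helper: encode a UTF-8 name into a legacy codepage,
// the way a byte-based filesystem API would have to.
// Assumes ext/mbstring; characters missing from the target codepage are
// replaced by mbstring's substitute character ('?' unless reconfigured).
function toCodepage(string $utf8Name, string $codepage): string
{
    return mb_convert_encoding($utf8Name, $codepage, 'UTF-8');
}

$name = 'kyuukyoku Choujin R 究極超人あ~る';

// Western codepage: none of the Japanese characters are representable.
$western = toCodepage($name, 'Windows-1252');

// Japanese codepage: the characters are representable, so converting
// the bytes back to UTF-8 should recover the original name.
$japanese  = toCodepage($name, 'CP932');
$roundTrip = mb_convert_encoding($japanese, 'UTF-8', 'CP932');

echo $western, "\n";
echo $roundTrip === $name ? "cp932 round-trip ok\n" : "cp932 round-trip lost data\n";
```

With the cp1252 target the Japanese characters should come out as question marks, much like the listing in the question, while the cp932 round trip keeps them.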

To get around this problem PHP would have to do the same as Python 3.0 and start using Unicode strings for filenames (and everything else), calling the '_wopen'-et-al functions to get native Unicode access to filenames under Windows. I expect this will happen in PHP6, but for the moment you're probably pretty much stuffed. You could change the system codepage to cp932 to get access to the filenames, but you'd still get '?' characters for any other Unicode characters not in Shift-JIS, and in any case you really don't want to make all of your application's internal strings Shift-JIS, as it's quite a horrible encoding.
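If the system codepage really is cp932, the names returned by readdir would presumably arrive as cp932 bytes and need converting before they are echoed into a UTF-8 page. A minimal sketch under that assumption follows; the function name listAsUtf8 and the $fsEncoding parameter are hypothetical, and mbstring is assumed to be available.

```php
<?php
// Hypothetical helper: list a directory and convert each name from the
// assumed filesystem encoding to UTF-8 for display.
// Assumption: readdir() returns bytes in the system codepage (cp932 here).
function listAsUtf8(string $dir, string $fsEncoding = 'CP932'): array
{
    $names = [];
    $handle = opendir($dir);
    if ($handle === false) {
        return $names; // directory could not be opened
    }
    while (false !== ($file = readdir($handle))) {
        if ($file === '.' || $file === '..') {
            continue; // skip the self and parent entries
        }
        $names[] = mb_convert_encoding($file, 'UTF-8', $fsEncoding);
    }
    closedir($handle);
    return $names;
}
```

This only changes how the names are displayed. Characters outside cp932 would already have been replaced by '?' before PHP sees them, so the limitation described above still applies.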

If your own scripts choose how the files are stored, I'd strongly suggest using simple primary-key-based filenames like '4356' locally, putting the real filename in a database, and serving the files up using rewrites or trailing path parts in the URL. Keeping user-supplied filenames as your own local filenames is difficult, and a recipe for security disasters even without having to worry about Unicode.
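The storage scheme suggested above might look roughly like this. It is a sketch, not a finished implementation: the files table, the function names, and the use of PDO are all assumptions made for illustration, and a real upload handler would use move_uploaded_file rather than copy.

```php
<?php
declare(strict_types=1);

// Hypothetical schema: one row per stored file; the id doubles as the
// on-disk filename, so the disk only ever sees ASCII digits.
function createSchema(PDO $db): void
{
    $db->exec(
        'CREATE TABLE IF NOT EXISTS files (
            id INTEGER PRIMARY KEY AUTOINCREMENT,
            original_name TEXT NOT NULL
        )'
    );
}

// Record the real (UTF-8) name in the database and copy the file
// into $storageDir under its numeric id.
function storeFile(PDO $db, string $storageDir, string $sourcePath, string $originalName): int
{
    $stmt = $db->prepare('INSERT INTO files (original_name) VALUES (?)');
    $stmt->execute([$originalName]);
    $id = (int) $db->lastInsertId();
    if (!copy($sourcePath, $storageDir . '/' . $id)) {
        throw new RuntimeException("Could not store file $id");
    }
    return $id;
}

// Look the real name back up when serving a download.
function originalName(PDO $db, int $id): ?string
{
    $stmt = $db->prepare('SELECT original_name FROM files WHERE id = ?');
    $stmt->execute([$id]);
    $name = $stmt->fetchColumn();
    return $name === false ? null : (string) $name;
}

// Build a download header that carries the UTF-8 name in
// percent-encoded form (the filename* parameter).
function downloadHeader(string $utf8Name): string
{
    return "Content-Disposition: attachment; filename*=UTF-8''" . rawurlencode($utf8Name);
}
```

A download script would then call originalName for the requested id, send downloadHeader with that name, and stream the file stored under the numeric id, so the Japanese name never has to pass through a byte-based filesystem call.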
