PHP "glob" and data deduplication?
Question
I have a PHP application that, on each request, checks for the existence of certain files on a network share.
I'm using glob for this, because I usually know only the beginning of the filename.
I noticed that glob does not return files that are currently opened by any client, so my application thinks file_xy does not exist if somebody has it open.
Is there a way to make glob return opened (:= locked?) files as well?
The strange thing is that this is not mentioned anywhere. However, I can confirm that glob is NOT returning files that are currently opened by a client. (As soon as the client closes the accessing application, glob returns the file as usual.)
P.S.: Not even glob("\\server\share\*") returns the file as long as it is open. (The network share allows the maximum number of concurrent users.)
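// Single quotes: '\\\\server\\share' is the UNC path \\server\share
$dir = opendir('\\\\server\\share');
// Compare with false so a file named "0" does not end the loop early
while (($file = readdir($dir)) !== false) {
    echo $file . "<br />";
}
closedir($dir);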
shows the file in question perfectly fine, whether or not it is opened by another client. So I can almost rule out any access limit or permission issue.
I have figured out what triggers this, even though I do not yet know why it happens:
The issue of glob() not finding an opened file appears when the file is located on a drive that uses the built-in data deduplication feature of Windows Server 2012 R2.
If I move the file to a share that is not deduplicated, glob() can read it, even when it is opened by multiple clients.
Since I have a working alternative, this question should mainly focus on why glob does not work here, or, let's say, works differently. There has to be a difference in how glob and readdir access the underlying filesystem to determine the directory contents.
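The working alternative mentioned above could look roughly like the following. This is only a minimal sketch, assuming the goal is to match entries by filename prefix; the helper name findByPrefix is made up for illustration and is not part of PHP. The comparison is case-sensitive, so strncasecmp may be the better fit on a case-insensitive share.

```php
<?php
// Hypothetical helper (not a PHP built-in): list entries in $dir whose names
// start with $prefix, using opendir()/readdir() instead of glob().
function findByPrefix(string $dir, string $prefix): array
{
    $handle = opendir($dir);
    if ($handle === false) {
        return []; // directory could not be opened
    }

    $matches = [];
    // Compare with false explicitly so an entry named "0" does not end the loop.
    while (($entry = readdir($handle)) !== false) {
        if ($entry === '.' || $entry === '..') {
            continue;
        }
        // Case-sensitive prefix comparison; an empty prefix matches everything.
        if (strncmp($entry, $prefix, strlen($prefix)) === 0) {
            $matches[] = $dir . DIRECTORY_SEPARATOR . $entry;
        }
    }
    closedir($handle);

    sort($matches);
    return $matches;
}
```

Something like findByPrefix('\\\\server\\share', 'file_xy') would then stand in for glob('\\\\server\\share\\file_xy*').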
There is further evidence that this is related to data deduplication: I configured the feature to deduplicate only files older than 3 days.
I set up a cron job that opens and globs a certain file on the share. Once the file was about 3 days old (Windows decides when to deduplicate), glob failed to list it while it was opened by another client.
Thus, glob is able to find open files during their first 3 days on the share, and then starts to miss them once they have been deduplicated.
glob fails, which is what prompted this post :-)
Using the mentioned scandir function shows the very same behavior:
- Deduplicated file opened by a client: missing from the resulting array.
- Deduplicated file not opened by a client: part of the resulting array.
I want to stress again that opendir together with readdir works in both cases.
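To make the difference between the three listing methods visible on a given share, a small diagnostic along these lines might help. It is a sketch only: the function names listViaReaddir and compareListings are made up for illustration, dot-entries are skipped everywhere because a plain "*" pattern does not match them on every platform, and the directory path is assumed to contain no glob metacharacters.

```php
<?php
// Hypothetical diagnostic helpers (names are illustrative, not PHP built-ins).

// Collect entry names via opendir()/readdir(), skipping names that start with ".".
function listViaReaddir(string $dir): array
{
    $names = [];
    $handle = opendir($dir);
    if ($handle === false) {
        return $names;
    }
    while (($entry = readdir($handle)) !== false) {
        if ($entry !== '' && $entry[0] !== '.') {
            $names[] = $entry;
        }
    }
    closedir($handle);
    return $names;
}

// Report names that readdir() returns but glob() or scandir() do not.
function compareListings(string $dir): array
{
    $base = rtrim($dir, '/\\') . DIRECTORY_SEPARATOR;

    $viaReaddir = listViaReaddir($dir);

    // glob() returns false on error; treat that like an empty result.
    $viaGlob = array_map('basename', glob($base . '*') ?: []);

    // scandir() also returns false on error; drop dot-entries to compare like with like.
    $viaScandir = array_filter(
        scandir($dir) ?: [],
        fn(string $name): bool => $name !== '' && $name[0] !== '.'
    );

    return [
        'missing_from_glob'    => array_values(array_diff($viaReaddir, $viaGlob)),
        'missing_from_scandir' => array_values(array_diff($viaReaddir, $viaScandir)),
    ];
}
```

On an ordinary local directory both lists would be expected to come back empty. On the deduplicated share described above, the opened file would presumably show up in both of them.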
This also produced the expected result every time.
I noticed that deduplicated files are shown with a "Size on disk" of 0 bytes, while files that are not yet deduplicated (and which are found successfully) are shown with the size they actually occupy (based on the filesystem's cluster size).
However, this would not explain why it makes a difference whether a file is opened by a client or not. The reported size is the same in both cases.
Recommended answer
Have you tried this?
$files = glob('{,.}*', GLOB_BRACE);
It is possible that the data deduplication feature keeps the opened file as a hidden file.