Fastest way to compare directory state, or hashing for fun and profit

Question

We have a PHP application, and we were thinking it might be advantageous for the application to know whether its makeup has changed since the last execution. This is mainly for managing caches and the like, since our applications are sometimes accessed by people who don't remember to clear the cache after making changes. (Changing the people is the obvious answer, but alas, not really achievable.)

We've come up with this, which is the fastest we've managed to eke out, averaging 0.08 (as measured with time) on a developer machine for a typical project. We've experimented with shasum, md5 and crc32, and this is the fastest. We are basically taking the md5 of the contents of every file, then taking the md5 of that output. Security isn't a concern; we're just interested in detecting filesystem changes via a differing checksum.

time (find application/ -path '*/.svn' -prune -o -type f -print0 | xargs -0 md5 | md5)

I suppose the question is: can this be optimised any further?

(I realise that pruning .svn will have a cost, but find takes the least time of all the components, so that cost will be pretty minimal. We're testing this on a working copy at the moment.)
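To make the "has anything changed since the last execution" check concrete, here is a minimal sketch that wraps the pipeline from the question and compares its result with a checksum saved by the previous run. The function names and the state-file argument are hypothetical, and the sketch assumes that either md5sum (GNU) or md5 (BSD/macOS) is installed.

```shell
#!/bin/sh
# Sketch: report whether a tree changed since the previous run.
# Assumes md5sum (GNU) or md5 (BSD/macOS) is on PATH.
if command -v md5sum >/dev/null 2>&1; then HASH=md5sum; else HASH=md5; fi

# tree_sum ROOT: the question's pipeline, printing one checksum for ROOT.
tree_sum() {
    find "$1" -path '*/.svn' -prune -o -type f -print0 | xargs -0 $HASH | $HASH
}

# changed_since_last ROOT STATE_FILE (hypothetical helper):
# returns 0 and rewrites STATE_FILE when the checksum differs from the
# stored one (or none is stored yet); returns 1 when nothing changed.
changed_since_last() {
    new=$(tree_sum "$1")
    old=$(cat "$2" 2>/dev/null || true)
    [ "$new" = "$old" ] && return 1
    printf '%s\n' "$new" > "$2"
    return 0
}
```

A caller could then run something like: changed_since_last application/ /tmp/app.state && clear the cache. The state file should live outside the tree being hashed, otherwise writing it would itself count as a change.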

Recommended answer

We didn't want to use FAM, since we would need to install it on the server, and that's not always possible. Sometimes clients insist that we deploy on their broken infrastructure. Since FAM is discontinued, it's also hard to get that change approved, red-tape-wise.

The only way to improve the speed of the original version in the question is to make sure your file list is as succinct as possible, i.e. only hash the directories and files whose changes really matter. Cutting out directories that aren't relevant can give a big speed boost.
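As a rough sketch of what trimming the file list might look like: the variant below prunes two extra directories and hashes only one file type. The names cache and logs and the *.php filter are assumptions chosen for illustration, not something from our project layout. The sort step is also an addition, there only to keep the file order (and therefore the final checksum) deterministic.

```shell
#!/bin/sh
# Sketch: checksum only the parts of the tree that matter.
# Assumes md5sum (GNU) or md5 (BSD/macOS) is on PATH.
if command -v md5sum >/dev/null 2>&1; then HASH=md5sum; else HASH=md5; fi

# dir_state ROOT (hypothetical helper): print one checksum for ROOT,
# skipping .svn plus the assumed cache/ and logs/ directories, and
# hashing only *.php files.
dir_state() {
    find "$1" \( -name .svn -o -name cache -o -name logs \) -prune \
        -o -type f -name '*.php' -print0 |
        LC_ALL=C sort -z |
        xargs -0 $HASH |
        $HASH
}
```

Whether filtering by extension is safe depends on what the cache is actually built from; if templates or config files feed into it, they would need to be added to the filter.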

Past that, the application was using the function to check whether there were changes, in order to clear the cache if there were. Since we don't really want to halt the application while it's doing this, this sort of thing is best farmed out carefully as an asynchronous process using fsockopen. That gives the best 'speed boost' overall; just be careful of race conditions.
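We did this from PHP with fsockopen; the sketch below is not that code. It swaps in a plain shell background job to illustrate the same two ideas: don't block the caller, and don't let two checks run at once. The helper name and the lock directory are hypothetical. The race guard relies on mkdir either creating the directory or failing, so only one caller gets the lock.

```shell
#!/bin/sh
# Sketch: run a slow check in the background, at most one at a time.

# run_once_async LOCK_DIR COMMAND [ARGS...] (hypothetical helper):
# returns 0 after starting COMMAND in the background, or 1 right away
# if another run still holds LOCK_DIR.
run_once_async() {
    lock=$1
    shift
    # mkdir fails if the directory already exists, so it acts as the lock.
    mkdir "$lock" 2>/dev/null || return 1
    (
        # Release the lock even if the command itself fails.
        "$@" || true
        rmdir "$lock"
    ) >/dev/null 2>&1 &
}
```

If the background job is killed before it reaches rmdir, the lock directory is left behind and later checks are skipped, so a real deployment would want some form of stale-lock cleanup.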

Marking this as the 'answer' and upvoting the FAM answer.
