A question about the filesystem cache


Problem description



When I read a large file in the file system, can the cache improve the speed of the operation?

I think there are two different answers:

1. Yes. Because the cache can prefetch, performance is improved.

2. No. Even though reading from the cache is faster than reading from disk, the data still has to come from the disk in the first place, so in the end the cache doesn't help and the overall reading speed is still the disk's reading speed.

Which one is correct? How can I verify the answer?

[edit]

And another question is:

What I am not sure about is: when you turn on the cache, is the disk bandwidth used for

1. prefetching only, or

2. prefetching and reading?

Which one is correct?

Whereas if you turn off the cache, the disk's bandwidth is used only for reading.

If I turn off the cache and access the disk randomly, is the time needed comparable to the time for sequential reads with the cache turned on?

Solution

1 is definitely correct. The operating system can fetch from the disk to the cache while your code is processing the data it's already received. Yes, the disk may well still be the bottleneck - but you won't have read, process, read, process, read, process, but read+process, read+process, read+process. For example, suppose we have processing which takes half the time of reading. Representing time going down the page, we might have this sort of activity without prefetching:

Read
Read
Process
Read
Read
Process
Read
Read
Process

Whereas with prefetching, this is optimised to:

Read
Read
Read     Process
Read
Read     Process
Read
         Process

Basically the total time will be "time to read whole file + time to process last piece of data" instead of "time to read whole file + time to process whole file".
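The read/process overlap described above can be simulated with a small Python sketch. The chunk count and sleep durations are made-up placeholders standing in for real disk reads and CPU work, and a background thread plays the role of the OS prefetcher:

```python
import queue
import threading
import time

READ_TIME = 0.05      # simulated time to read one chunk from "disk"
PROCESS_TIME = 0.025  # simulated processing time (half the read time, as above)
CHUNKS = 6

def read_chunk():
    time.sleep(READ_TIME)   # stand-in for a disk read
    return b"data"

def process_chunk(chunk):
    time.sleep(PROCESS_TIME)  # stand-in for CPU work on the chunk

def serial():
    # read, process, read, process, ...
    start = time.perf_counter()
    for _ in range(CHUNKS):
        process_chunk(read_chunk())
    return time.perf_counter() - start

def pipelined():
    # a "prefetcher" thread keeps reading while the main thread processes
    q = queue.Queue(maxsize=2)

    def prefetch():
        for _ in range(CHUNKS):
            q.put(read_chunk())
        q.put(None)  # sentinel: end of file

    threading.Thread(target=prefetch, daemon=True).start()
    start = time.perf_counter()
    while (chunk := q.get()) is not None:
        process_chunk(chunk)
    return time.perf_counter() - start

if __name__ == "__main__":
    # serial    ~ CHUNKS * (READ_TIME + PROCESS_TIME)
    # pipelined ~ CHUNKS * READ_TIME + PROCESS_TIME
    print(f"serial:    {serial():.3f}s")
    print(f"pipelined: {pipelined():.3f}s")
```

With these numbers the serial version takes roughly CHUNKS * (READ_TIME + PROCESS_TIME) while the pipelined version takes roughly CHUNKS * READ_TIME + PROCESS_TIME, matching the "time to read whole file + time to process last piece" formula.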

Testing it is tricky - you'll need to have an operating system where you can tweak or turn off the cache. Another alternative is to change how you're opening the file - for instance, in .NET if you open the file with FileOptions.SequentialScan the cache is more likely to do the right thing. Try with and without that option.
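On Linux one way to test without rebooting is os.posix_fadvise with POSIX_FADV_DONTNEED, which asks the kernel to evict a file's cached pages, letting you compare a "cold" read against a "warm" one. A minimal sketch (the chunk size is arbitrary, and the advice is only a hint, so the kernel may not evict everything):

```python
import os
import time

def time_read(path, drop_cache=False):
    """Time one sequential pass over the file.

    If drop_cache is set, first ask the kernel to evict the file's
    cached pages (Linux-only, advisory) so the read is mostly cold.
    """
    fd = os.open(path, os.O_RDONLY)
    try:
        if drop_cache:
            os.posix_fadvise(fd, 0, 0, os.POSIX_FADV_DONTNEED)
        start = time.perf_counter()
        while os.read(fd, 1 << 20):  # read in 1 MiB chunks until EOF
            pass
        return time.perf_counter() - start
    finally:
        os.close(fd)
```

Read the file once to warm the cache, then compare time_read(path) against time_read(path, drop_cache=True); on a spinning disk the cold pass is typically much slower.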

This has spoken mostly about prefetching - general caching (keeping the data even after it's been delivered to the application) is a different matter, and obviously acts as a big win if you want to use the same data more than once. There's also "something in between" where the application has only requested a small amount of data, but the disk has read a whole block - the OS isn't actively prefetching blocks which haven't been requested, but can cache the whole block so that if the app then requests more data from the same block it can return that data from the cache.
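That "something in between" behaviour can be mimicked in user space with Python's io.BufferedReader: a one-byte request fills a whole buffer (analogous to the OS reading a whole block), and later nearby requests are served from that buffer without touching the file again. A sketch using a hypothetical CountingRaw wrapper that counts how many reads actually reach the file:

```python
import io
import os

class CountingRaw(io.RawIOBase):
    """Raw file wrapper that counts reads reaching the real file."""
    def __init__(self, path):
        self.fd = os.open(path, os.O_RDONLY)
        self.reads = 0

    def readable(self):
        return True

    def readinto(self, buf):
        self.reads += 1
        data = os.read(self.fd, len(buf))
        buf[:len(data)] = data
        return len(data)

    def close(self):
        if not self.closed:
            os.close(self.fd)
        super().close()

def small_reads(path, n=100):
    """Do n one-byte reads through a 4 KiB buffer.

    Returns how many reads hit the underlying file - far fewer than n,
    because each real read fills the whole buffer ("block") at once.
    """
    raw = CountingRaw(path)
    with io.BufferedReader(raw, buffer_size=4096) as f:
        for _ in range(n):
            f.read(1)
    return raw.reads
```

The application-level analogy is not exact (the OS page cache is shared and persistent, the BufferedReader buffer is not), but it shows the same effect: many tiny requests, very few actual block reads.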
