Is using istream::seekg too expensive?


Question

In C++, how expensive is it to use the istream::seekg operation?

Edit: How much can I get away with seeking around a file and reading bytes? What about frequency versus magnitude of offset?

I have a large file (4GB) that I am parsing, and I want to know if it's necessary to try to consolidate some of my seekg calls. I would assume that the magnitude of the difference in file position plays a role (for example, seeking more than a memory page away would impact performance) but that small seeks are of no consequence. Is this correct?

Recommended answer

This question is heavily dependent on your operating system and disk subsystem.

Obviously, the seek itself will take essentially zero time, since it just updates an offset. Actually reading will pull some data off the disk...
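
To make the split between the seek and the read concrete, here is a minimal sketch. The helper name `read_at` is hypothetical, not something from the question. It assumes a typical implementation in which `seekg` only repositions the stream, so any disk access happens in the `read` call.

```cpp
#include <cassert>
#include <cstddef>
#include <fstream>
#include <string>
#include <vector>

// Hypothetical helper: read up to `count` bytes starting at byte `offset`.
// Returns the bytes actually read (fewer than `count` near the end of the file).
std::vector<char> read_at(std::ifstream& in, std::streamoff offset, std::size_t count) {
    assert(offset >= 0);
    std::vector<char> buf(count);
    in.clear();                       // drop eof/fail bits left by an earlier short read
    in.seekg(offset, std::ios::beg);  // repositions the stream; no file data is requested here
    in.read(buf.data(), static_cast<std::streamsize>(count));  // this call may reach the disk
    buf.resize(static_cast<std::size_t>(in.gcount()));
    return buf;
}
```

Called as `read_at(in, 3, 4)` on a stream opened with `std::ios::binary`, it returns the four bytes at offsets 3 through 6.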

...but how much data depends on many things. Your disk has a cache, which may have its own block size and may do some sort of read-ahead. Your RAID controller (if any) will have its own cache, possibly with its own block size and read-ahead.

Your kernel has a page cache (all of free RAM, essentially), and it will probably also do some sort of read-ahead. On Linux this is configurable, and the kernel will adapt it based on how sequential your access patterns appear to be, whether you have called posix_fadvise, and so on.
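
As a rough illustration of the posix_fadvise hint, here is a sketch that assumes a POSIX system such as Linux. Standard C++ streams do not expose a file descriptor portably, so it uses the raw `open`/`read` calls instead of `istream`. The function name `sum_bytes_sequential` is made up for the example, and the hint is advisory only: the kernel is free to ignore it.

```cpp
#include <fcntl.h>     // open, posix_fadvise (POSIX only)
#include <sys/types.h> // ssize_t
#include <unistd.h>    // read, close
#include <cassert>
#include <string>

// Hypothetical helper: sum every byte of a file after telling the kernel
// that the file will be read sequentially. Returns -1 on open/read failure.
long long sum_bytes_sequential(const std::string& path) {
    assert(!path.empty());
    int fd = open(path.c_str(), O_RDONLY);
    if (fd < 0) return -1;

    // offset 0 with len 0 means "the whole file". The return value is
    // ignored on purpose: a failed hint does not prevent reading.
    (void)posix_fadvise(fd, 0, 0, POSIX_FADV_SEQUENTIAL);

    char buf[64 * 1024];
    long long total = 0;
    ssize_t n;
    while ((n = read(fd, buf, sizeof buf)) > 0) {
        for (ssize_t i = 0; i < n; ++i) {
            total += static_cast<unsigned char>(buf[i]);
        }
    }
    close(fd);
    return n < 0 ? -1 : total;
}
```

For a random-access workload the corresponding hint is `POSIX_FADV_RANDOM`. Whether either hint helps on a given system is something to measure, not assume.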

All of these caches mean that if you access some data and then access nearby data later, there is a chance the second access will not touch the disk at all.

If you have the option of coding so that you access the file sequentially, that is certainly going to be faster than random reads, especially small random reads. A seek on a single mechanical disk takes something like 10 ms, so you can do the math here. (Seeking on a solid-state drive is around 100 times faster, though.)
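
"Doing the math" can be sketched as a back-of-the-envelope cost model. It is an estimate, not a measurement: it assumes every random read misses all caches, and the sequential throughput figure is an assumption you would replace with your own drive's number. Both function names are hypothetical.

```cpp
#include <cassert>

// Worst-case time for `num_reads` random reads when every one of them pays a
// full seek of `seek_ms` milliseconds (about 10 for a mechanical disk).
double random_read_seconds(double num_reads, double seek_ms) {
    assert(num_reads >= 0.0 && seek_ms >= 0.0);
    return num_reads * seek_ms / 1000.0;
}

// Time to stream `file_mb` megabytes front to back at an assumed sustained
// rate of `throughput_mb_s` megabytes per second.
double sequential_read_seconds(double file_mb, double throughput_mb_s) {
    assert(throughput_mb_s > 0.0);
    return file_mb / throughput_mb_s;
}
```

With these assumptions, one million uncached random reads at 10 ms each come to 10,000 seconds (roughly 2.8 hours) on a mechanical disk, and about 100 seconds on a drive that seeks 100 times faster. Streaming a 4 GB (4096 MB) file at an assumed 100 MB/s takes about 41 seconds. Caching and read-ahead will usually make the random case better than this worst-case figure.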

Large reads are generally better than small reads... although processing data a few kilobytes at a time can be faster than using larger blocks if it allows the processing to stay in cache.
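
A minimal sketch of processing a file a few kilobytes at a time follows. Counting newlines is just a stand-in for whatever parsing you actually do, and the 16 KB default is an arbitrary starting point to tune, not a recommendation.

```cpp
#include <cassert>
#include <cstddef>
#include <fstream>
#include <string>
#include <vector>

// Hypothetical example: count newline bytes, reading `chunk_size` bytes per call.
// Returns 0 if the file cannot be opened.
std::size_t count_newlines(const std::string& path, std::size_t chunk_size = 16 * 1024) {
    assert(chunk_size > 0);
    std::ifstream in(path, std::ios::binary);
    std::vector<char> buf(chunk_size);
    std::size_t count = 0;
    while (in) {
        in.read(buf.data(), static_cast<std::streamsize>(buf.size()));
        const std::size_t got = static_cast<std::size_t>(in.gcount());  // may be short on the last chunk
        for (std::size_t i = 0; i < got; ++i) {
            if (buf[i] == '\n') ++count;
        }
    }
    return count;
}
```

Running the same function with several values of `chunk_size` is a cheap way to see where the trade-off lands for your data.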

In short, you will need to provide a lot more details about your system and your application to get a proper answer, and even then the most likely answer is "benchmark it".
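
If you do benchmark it, a small harness along these lines may be a reasonable starting point. The `ReadTiming` struct and `time_reads` function are hypothetical names. The sketch times a list of seek-then-read operations with `std::chrono::steady_clock`, so you can pass the same offsets once sorted and once shuffled and compare.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <fstream>
#include <string>
#include <vector>

struct ReadTiming {
    double seconds;          // wall-clock time spent on all seeks and reads
    std::size_t bytes_read;  // total bytes actually read
};

// Hypothetical harness: visit `offsets` in the given order, reading up to
// `bytes_per_read` bytes at each one.
ReadTiming time_reads(const std::string& path,
                      const std::vector<std::streamoff>& offsets,
                      std::size_t bytes_per_read) {
    assert(bytes_per_read > 0);
    std::ifstream in(path, std::ios::binary);
    std::vector<char> buf(bytes_per_read);
    std::size_t total = 0;

    const auto start = std::chrono::steady_clock::now();
    for (std::streamoff off : offsets) {
        in.clear();                    // recover from a short read at end of file
        in.seekg(off, std::ios::beg);
        in.read(buf.data(), static_cast<std::streamsize>(buf.size()));
        total += static_cast<std::size_t>(in.gcount());
    }
    const std::chrono::duration<double> elapsed =
        std::chrono::steady_clock::now() - start;

    return {elapsed.count(), total};
}
```

Keep in mind that a second run over the same file is likely to be served from the page cache, so numbers from repeated runs can look far better than a cold read of a 4GB file would.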
