Disk seek time measurement method


Question

I wrote a script to measure seek times on an HDD, and a small change in how the measurement is done results in dramatically different times.

The first loop jumps around within an area at the beginning of the disk. The second loop picks random areas (of the same size) on the disk and performs the seeks there. The two approaches are clearly different, but I don't understand why that would change the results. Note that for large areas the measurements converge for both methods.

The Bytes* functions just format numbers nicely (1024 <-> "1KB"). The script must be run as root. The disk is sdb by default.

import sys, os, time, random


#--------------------------------------------------------------------------------------------------

def BytesString(n):
    suffixes = ['B','KB','MB','GB','TB','PB','EB','ZB','YB']
    suffix = 0
    while n % 1024 == 0 and suffix+1 < len(suffixes):
        suffix += 1
        n /= 1024
    return '{0}{1}'.format(n, suffixes[suffix])

def BytesInt(s):
    if all(c in '0123456789' for c in s):
        return int(s)
    suffixes = ['B','KB','MB','GB','TB','PB','EB','ZB','YB']
    for power,suffix in reversed(list(enumerate(suffixes))):
        if s.endswith(suffix):
            return int(s.rstrip(suffix))*1024**power
    raise ValueError('BytesInt requires proper suffix ('+' '.join(suffixes)+').')

def BytesStringFloat(n):
    x = float(n)
    suffixes = ['B','KB','MB','GB','TB','PB','EB','ZB','YB']
    suffix = 0
    while x > 1024.0 and suffix+1 < len(suffixes):
        suffix += 1
        x /= 1024.0
    return '{0:0.2f}{1}'.format(x, suffixes[suffix])


#--------------------------------------------------------------------------------------------------

disk = open('/dev/sdb', 'r')
disk.seek(0,2)
disksize = disk.tell()
os.system('echo noop | sudo tee /sys/block/sdb/queue/scheduler > /dev/null')

print 'Syntax: program [-s -sr -t -tr] [-v]:  to run specific modes; for verbose mode.'
print 'Disk name: {0}  Disk size: {1}  Scheduler disabled.'.format(
    disk.name, BytesStringFloat(disksize))

displaytimes = '-v' in sys.argv


#--------------------------------------------------------------------------------------------------

bufsize = 512
bufcount = 100
displaysamplecount = 24

for randomareas in [False,True]:
    print
    print 'Measuring: Random seek time {0}'.format(
        'using random areas of disk.' if randomareas else 'using beginning of disk.')
    print 'Samples: {0}{1}   Sample size: {2}'.format(
        bufcount, ' (displayed {0})'.format(displaysamplecount) if displaytimes else '', bufsize)

    for area in [BytesInt('1MB')*2**i for i in range(0,64)]+[disksize]:
        if area > disksize:
            continue

        os.system('echo 3 | sudo tee /proc/sys/vm/drop_caches > /dev/null')

        times = []
        disk.seek(0)
        disk.read(bufsize)
        for _ in range(bufcount):
            left = random.randint(0, disksize-area) if randomareas else 0
            right = left + random.randint(0, area)
            disk.seek(left)
            disk.read(bufsize)
            start = time.time()
            disk.seek(right)
            disk.read(bufsize)
            finish = time.time()
            times.append(finish-start)

        times = sorted(times)[:bufcount*95/100]
        print 'Area tested: {0:6}   Average: {1:5.2f} ms   Max: {2:5.2f} ms   Total: {3:0.2f} sec'.format(
            BytesString(area) if area < disksize else BytesStringFloat(area), 
            sum(times)/len(times)*1000, max(times)*1000, sum(times))
        if displaytimes:
            print 'Read times: {0} ... {1} ms'.format(
                ' '.join(['{0:0.2f}'.format(x*1000) for x in times[:displaysamplecount/2]]), 
                ' '.join(['{0:0.2f}'.format(x*1000) for x in times[-displaysamplecount/2:]]))







Measuring: Random seek time using beginning of disk.
Samples: 100   Sample size: 512
Area tested: 1MB      Average:  0.14 ms   Max:  0.35 ms   Total: 0.01 sec
Area tested: 2MB      Average:  0.16 ms   Max:  0.31 ms   Total: 0.02 sec
Area tested: 4MB      Average:  0.20 ms   Max:  0.75 ms   Total: 0.02 sec
Area tested: 8MB      Average:  0.19 ms   Max:  0.97 ms   Total: 0.02 sec
Area tested: 16MB     Average:  0.64 ms   Max:  7.97 ms   Total: 0.06 sec
Area tested: 32MB     Average:  2.29 ms   Max: 10.56 ms   Total: 0.22 sec
Area tested: 64MB     Average:  3.89 ms   Max: 12.25 ms   Total: 0.37 sec
Area tested: 128MB    Average:  6.32 ms   Max: 13.18 ms   Total: 0.60 sec
Area tested: 256MB    Average:  6.73 ms   Max: 13.04 ms   Total: 0.64 sec
Area tested: 512MB    Average:  7.43 ms   Max: 13.72 ms   Total: 0.71 sec
Area tested: 1GB      Average:  8.38 ms   Max: 13.59 ms   Total: 0.80 sec
Area tested: 2GB      Average:  8.51 ms   Max: 13.81 ms   Total: 0.81 sec
Area tested: 4GB      Average:  8.87 ms   Max: 13.86 ms   Total: 0.84 sec
Area tested: 8GB      Average:  9.82 ms   Max: 14.66 ms   Total: 0.93 sec
Area tested: 16GB     Average:  9.73 ms   Max: 15.95 ms   Total: 0.92 sec
Area tested: 32GB     Average:  9.89 ms   Max: 15.18 ms   Total: 0.94 sec
Area tested: 64GB     Average: 10.60 ms   Max: 15.85 ms   Total: 1.01 sec
Area tested: 128GB    Average: 11.18 ms   Max: 18.68 ms   Total: 1.06 sec
Area tested: 256GB    Average: 13.31 ms   Max: 30.94 ms   Total: 1.26 sec
Area tested: 512GB    Average: 14.14 ms   Max: 31.70 ms   Total: 1.34 sec
Area tested: 1TB      Average: 15.20 ms   Max: 33.35 ms   Total: 1.44 sec
Area tested: 1.36TB   Average: 15.47 ms   Max: 25.30 ms   Total: 1.47 sec

Measuring: Random seek time using random areas of disk.
Samples: 100   Sample size: 512
Area tested: 1MB      Average:  7.21 ms   Max: 35.94 ms   Total: 0.69 sec
Area tested: 2MB      Average:  5.40 ms   Max: 12.92 ms   Total: 0.51 sec
Area tested: 4MB      Average:  6.97 ms   Max: 36.60 ms   Total: 0.66 sec
Area tested: 8MB      Average:  7.24 ms   Max: 15.05 ms   Total: 0.69 sec
Area tested: 16MB     Average:  7.36 ms   Max: 13.03 ms   Total: 0.70 sec
Area tested: 32MB     Average:  7.34 ms   Max: 12.30 ms   Total: 0.70 sec
Area tested: 64MB     Average:  7.35 ms   Max: 13.47 ms   Total: 0.70 sec
Area tested: 128MB    Average:  7.66 ms   Max: 13.37 ms   Total: 0.73 sec
Area tested: 256MB    Average:  7.93 ms   Max: 13.34 ms   Total: 0.75 sec
Area tested: 512MB    Average: 10.16 ms   Max: 39.67 ms   Total: 0.97 sec
Area tested: 1GB      Average:  8.76 ms   Max: 14.38 ms   Total: 0.83 sec
Area tested: 2GB      Average:  9.42 ms   Max: 17.74 ms   Total: 0.89 sec
Area tested: 4GB      Average: 11.00 ms   Max: 23.22 ms   Total: 1.05 sec
Area tested: 8GB      Average: 10.59 ms   Max: 19.60 ms   Total: 1.01 sec
Area tested: 16GB     Average: 10.91 ms   Max: 19.15 ms   Total: 1.04 sec
Area tested: 32GB     Average: 11.19 ms   Max: 26.02 ms   Total: 1.06 sec
Area tested: 64GB     Average: 12.59 ms   Max: 26.49 ms   Total: 1.20 sec
Area tested: 128GB    Average: 11.97 ms   Max: 19.30 ms   Total: 1.14 sec
Area tested: 256GB    Average: 12.61 ms   Max: 22.84 ms   Total: 1.20 sec
Area tested: 512GB    Average: 13.62 ms   Max: 20.48 ms   Total: 1.29 sec
Area tested: 1TB      Average: 16.72 ms   Max: 29.20 ms   Total: 1.59 sec
Area tested: 1.36TB   Average: 15.96 ms   Max: 26.21 ms   Total: 1.52 sec


Recommended answer

Modern HDDs have built-in caching: when you read a position, some logic in the drive caches the areas around it internally. If your next read is near that position, the drive serves the data from its cache if it is present there; otherwise it reads from the disk.

Reading from the beginning of the disk:

Measuring: Random seek time using beginning of disk.
Samples: 100   Sample size: 512
Area tested: 1MB      Average:  0.14 ms   Max:  0.35 ms   Total: 0.01 sec

will cache things from there, so successive reads are served from the (faster) cache.

Reading random locations:

Measuring: Random seek time using random areas of disk.
Samples: 100   Sample size: 512
Area tested: 1MB      Average:  7.21 ms   Max: 35.94 ms   Total: 0.69 sec

will not be able to read from the cache, unless you read "the same random location" several times in a row.
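The effect can be illustrated with a toy model. The sketch below is written for Python 3 and does not touch a real disk; it assumes the drive keeps recently read 64 KiB segments in an 8 MiB LRU cache. Both sizes, the LRU policy, and the names `simulate_hit_rate` and `DISK` are assumptions made for illustration, not properties of any particular drive or part of the original script. It only counts how often the timed read would land in an already cached segment.

```python
import random
from collections import OrderedDict

def simulate_hit_rate(disksize, area, random_areas, samples=100,
                      cache_bytes=8 * 2**20, segment_bytes=64 * 2**10, seed=0):
    """Return the fraction of timed reads served by a toy LRU drive cache."""
    rng = random.Random(seed)
    capacity = cache_bytes // segment_bytes   # segments the toy cache can hold
    cache = OrderedDict()                     # segment index -> None, oldest first

    def read(offset):
        """Touch the segment containing offset; return True on a cache hit."""
        segment = offset // segment_bytes
        hit = segment in cache
        cache[segment] = None
        cache.move_to_end(segment)            # mark as most recently used
        if len(cache) > capacity:
            cache.popitem(last=False)         # evict the least recently used segment
        return hit

    hits = 0
    for _ in range(samples):
        # Same offset selection as the loop in the question.
        left = rng.randint(0, disksize - area) if random_areas else 0
        right = left + rng.randint(0, area)
        read(left)                            # untimed positioning read
        hits += read(right)                   # timed read
    return hits / samples

DISK = 1500 * 10**9   # hypothetical size, roughly the 1.36TB disk in the question

for area in (2**20, 2**30, DISK):
    print(area,
          simulate_hit_rate(DISK, area, random_areas=False),
          simulate_hit_rate(DISK, area, random_areas=True))
```

In this model a small area at the beginning of the disk is fully cached after a few reads, while random areas almost never hit. Once the area is much larger than the cache, both modes miss nearly every time, which is consistent with the two measurements converging for large areas.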

Your code does not use the same random area 100 times:


for _ in range(bufcount):
    left = random.randint(0, disksize-area) if randomareas else 0
    right = left + random.randint(0, area)
    disk.seek(left)
    disk.read(bufsize)
    start = time.time()
    disk.seek(right)
    disk.read(bufsize)
    finish = time.time()
    times.append(finish-start)


It creates a new left and right for each of the 100 samples (bufcount). When you seek randomly like this, you do not benefit from the HDD's cache (most of the time, unless the random numbers happen to land close together by sheer chance).
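One way to check this explanation would be a third mode that picks a random area once and reuses it for all samples. The sketch below (Python 3) only generates the offsets; `seek_pairs` and the mode name 'fixed-random' are hypothetical and not part of the original script. If the cache explanation holds, timings in 'fixed-random' mode should look more like the beginning-of-disk results, but that is an expectation to test, not a measured result.

```python
import random

MODES = ('beginning', 'random', 'fixed-random')

def seek_pairs(disksize, area, count, mode, rng=random):
    """Return a list of (left, right) byte offsets for `count` timed seeks.

    'beginning' and 'random' mirror the two loops of the original script;
    'fixed-random' is a hypothetical third mode that reuses one random area.
    """
    if mode not in MODES:
        raise ValueError('mode must be one of ' + ', '.join(MODES))
    # Chosen once, so every sample in 'fixed-random' mode shares this area.
    base = rng.randint(0, disksize - area)
    pairs = []
    for _ in range(count):
        if mode == 'beginning':
            left = 0
        elif mode == 'random':
            left = rng.randint(0, disksize - area)   # new area for every sample
        else:
            left = base                              # same area for every sample
        pairs.append((left, left + rng.randint(0, area)))
    return pairs
```

In the measuring loop, each pair would replace the `left` and `right` computed per iteration, with the seek, read, and timing code left unchanged.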
