Overwrite a file on tape

Question

I'm trying to write a program that stores a large amount of data (hundreds of PB) on tapes. I'm using tar to group files together, but for technical reasons I've decided to write multiple tar archives to each tape.

To make it easy to find out what data is on a tape, I've decided to create a small index and write it at the beginning of the tape. So I'm doing something like this:

# create a 1 MiB placeholder index file (random filler, to be replaced later)
head -c 1M < /dev/urandom > index.txt

# rewind tape
mt -f /dev/nst0 rewind

# write index to the beginning of the tape
dd bs=4k if=index.txt of=/dev/nst0


# write tar file to tape
dd bs=4k if=one.tar of=/dev/nst0
...

After I've copied all the tar files, I create a new index.txt of exactly the same size and write it to the beginning of the tape:

mt -f /dev/nst0 rewind
dd bs=4k if=index.txt of=/dev/nst0

But this corrupts the rest of the data. By corrupt I mean that if I rewind the tape and try to read from it, I can only read the index.txt file; after that no more data can be read, and mt status reports:

SCSI 2 tape drive:
File number=1, block number=-1, partition=0.
Tape block size 0 bytes. Density code 0x5c (LTO-7).
Soft error count since last status=0
General status bits on (9010000):
 EOD ONLINE IM_REP_EN

At first I thought dd had somehow ruined the EOF mark at the end of index.txt, so I tried to overwrite only the beginning of the file:

dd conv=notrunc count=10 bs=4k if=index.txt of=/dev/nst4

The weird thing is that after that, the first entry on the tape is only 40K (10 blocks of 4k each)!

Am I missing something about the behavior of tapes and the dd command?

P.S.: The data is stored as objects in Ceph, so I have to download it first, and I don't have enough disk space to hold a full tape's worth of data.

Recommended answer

I had the same idea and I hit the same problem. I am working on a simple tape backup program, which is basically a wrapper for tar. It also writes a table of contents (TOC) at the beginning, which can be retrieved using the list function, and it has a verify function that checks whether the files in the archive still match their original checksums or whether something has been damaged.

I wanted to implement a real append function, but to my surprise it doesn't seem possible to prevent the system from writing a filemark (at the wrong position, within the archive) after updating the TOC at the beginning.
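To illustrate the behavior described here, the following is a toy model of sequential-tape semantics that uses a directory of numbered files instead of a real drive. It is only a sketch: the names `tape_init`, `tape_rewind`, `tape_write` and `tape_file_count` are hypothetical helpers made up for this example, and the model assumes that a write at any position makes that position the new end of data, which matches the `EOD` flag at file number 1 in the mt status output from the question.

```shell
# Toy model of a tape: each tape file is a numbered file in a temp directory.
# All names here are hypothetical; no real tape device is touched.
TAPE_DIR=
TAPE_POS=0

# Start with a blank "tape".
tape_init() {
    TAPE_DIR=$(mktemp -d)
    TAPE_POS=0
}

# Move back to the first tape file.
tape_rewind() {
    TAPE_POS=0
}

# Write file $1 as one tape file at the current position.
# Everything from that position onward is discarded first, because the
# write defines the new end of data.
tape_write() {
    n=$TAPE_POS
    while [ -e "$TAPE_DIR/$n" ]; do
        rm -f "$TAPE_DIR/$n"
        n=$((n + 1))
    done
    cat "$1" > "$TAPE_DIR/$TAPE_POS"
    TAPE_POS=$((TAPE_POS + 1))
}

# Count the tape files that are still readable from the start.
tape_file_count() {
    n=0
    while [ -e "$TAPE_DIR/$n" ]; do
        n=$((n + 1))
    done
    echo "$n"
}
```

In this model, writing index.txt, one.tar and two.tar gives a count of 3; after `tape_rewind` and one more `tape_write index.txt` the count drops to 1, which is the same symptom the question reports.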

However, my backup program, which goes by the name of "TOCTAR", also has a safety check that prevents the admin from overwriting the first archive on the tape if the tape file index option wasn't provided. It also has an auto-append feature, which tries to find all archives created by the program, leaves them untouched and creates a new tape file at the end. Maybe it'll be useful for your use case. Feel free to open a GitHub issue if you find something that's wrong or missing.
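The append-at-the-end approach can also be sketched with plain mt and dd. This is not how TOCTAR does it, just a minimal illustration under a few assumptions: the mt-st version of mt (which provides the `eod` and `fsf` operations), the non-rewinding device /dev/nst0 from the question, and hypothetical names `run`, `append_tape_file`, `read_tape_file`, `TAPE` and `DRY_RUN`. By default the functions only print the commands they would run.

```shell
# Sketch only: TAPE, DRY_RUN and the function names are assumptions.
TAPE=${TAPE:-/dev/nst0}
DRY_RUN=${DRY_RUN:-1}   # 1 = only print the commands, 0 = really run them

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Append file $1 as a new tape file after all existing ones.
append_tape_file() {
    run mt -f "$TAPE" eod &&
    run dd bs=4k if="$1" of="$TAPE"
}

# Read tape file number $1 (0-based) into the local file $2.
read_tape_file() {
    run mt -f "$TAPE" rewind || return
    if [ "$1" -gt 0 ]; then
        run mt -f "$TAPE" fsf "$1" || return
    fi
    run dd bs=4k if="$TAPE" of="$2"
}
```

For example, `append_tape_file two.tar` prints an `mt ... eod` line followed by the dd line, and `read_tape_file 2 out.tar` rewinds, skips two filemarks and then reads. An updated index could be appended the same way as one more tape file, at the cost of having to seek to it before reading.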

In short: I haven't managed to overwrite a section of a tape without truncating it. But you can create as many archives on a tape as you like, and my tape backup program "TOCTAR" might help you with that. (Shameless self-promotion.)

I'd post the GitHub URL here, but GitHub is currently down (504 Gateway Time-out). What a sad day.

Update: https://github.com/c0xc/toctar.git
