Data Transfer Dilemma

This article walks through a data transfer dilemma and some suggested ways to handle it, which may serve as a useful reference for anyone facing a similar problem.

Problem Description


Hopefully this is the right forum; it didn't seem to quite fit into any of the others.

My company has 20 remote sites that back up data to our main building.
Presently, a scheduled batch job is set up that zips (tar + gnu) all the data into one large file, then FTPs the archive to the main location. The size of the data coming over ranges from 50MB to 1.5GB per location, 12GB total.
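For reference, the scheduled job described here could look roughly like the following Python sketch; the directory paths, host name, and credentials are placeholders assumed for illustration, not details from the original post.

```python
import ftplib
import tarfile

# Hypothetical paths and credentials -- placeholders, not from the post.
DATA_DIR = "/data"
ARCHIVE = "/backup/site_backup.tar.gz"
FTP_HOST = "backup.example.com"

# Pack the whole data directory into one gzipped tarball ("tar + gnu" style).
with tarfile.open(ARCHIVE, "w:gz") as tar:
    tar.add(DATA_DIR, arcname="data")

# Push the single large archive to the main location over FTP.
with ftplib.FTP(FTP_HOST, "user", "password") as ftp:
    with open(ARCHIVE, "rb") as f:
        ftp.storbinary("STOR site_backup.tar.gz", f)
```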

The size of the data will continue to grow as time goes on.
On a day-to-day basis, only 1-5% of the files coming over will have actually changed, but those files can be 20-25% of the total size of the transfer.

For me, it seemed like a waste to bring over the rest of the files that haven't changed.

Why I came here: I'm hoping to get some advice about a sequence of events and/or programs to accomplish bringing over only the files that have changed.

I hope to have a zipped copy of the files in a backup directory both at the remote site and at my location, with as little extra data transferred, in the least amount of time possible.

It should be pretty simple?
Data Directories -- (sync) --> Backup Directory
Backup Directory -- (sync) --> FTP Site
It would be nice to have the Backup Directory and FTP Site sync zipped files instead, so the files travel across the internet compressed.
Is it possible to compare zipped archives over an FTP (SFTP, etc.) type connection and send only the files that have changed from one archive to the other?

A secondary option could be, instead of creating one large zip file, to zip all the files as individual archives in the backup directory, then sync those?
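If that secondary option were taken, a sketch along these lines (paths assumed for illustration) could keep one small zip per file, mirroring the directory layout, so a later sync only moves the archives whose source files changed:

```python
import os
import zipfile

SRC = "/data"          # assumed source tree
DST = "/backup/zips"   # assumed backup directory of individual archives

for root, _dirs, files in os.walk(SRC):
    for name in files:
        src_path = os.path.join(root, name)
        rel_path = os.path.relpath(src_path, SRC)
        zip_path = os.path.join(DST, rel_path + ".zip")
        os.makedirs(os.path.dirname(zip_path), exist_ok=True)
        # Re-zip only if the source is newer than its existing archive.
        if (not os.path.exists(zip_path)
                or os.path.getmtime(src_path) > os.path.getmtime(zip_path)):
            with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as zf:
                zf.write(src_path, arcname=rel_path)
```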

Any thoughts would be greatly appreciated.

Thanks!
Adam

Solution

Don't know about doing it through the FTP facility, but it should be possible to check the CRCs of files within the archive, in order to get a comparison without having to transfer the entire archive.
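The zip format does store a CRC-32 for every member, so this check can be done locally without extracting anything. A minimal sketch in Python, with the archive names assumed for illustration:

```python
import zipfile

def archive_crcs(path):
    """Map each member name in a zip archive to its stored CRC-32."""
    with zipfile.ZipFile(path) as zf:
        return {info.filename: info.CRC for info in zf.infolist()}

old = archive_crcs("yesterday.zip")   # assumed archive names
new = archive_crcs("today.zip")
changed = [n for n, crc in new.items() if old.get(n) != crc]
print(changed)
```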

Hm... how about this?

Along with the archive (zip) file, perhaps you could store a text file listing the contents of the archive. This could be FTP'd across and examined to determine what else has to be done.
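A minimal sketch of that manifest idea, assuming a simple tab-separated name-and-CRC text format: write the listing next to the archive, FTP the small text file across, and diff the two listings to decide which members actually need to travel.

```python
import zipfile

def write_manifest(archive_path, manifest_path):
    # One "name<TAB>crc" line per archive member.
    with zipfile.ZipFile(archive_path) as zf, open(manifest_path, "w") as out:
        for info in zf.infolist():
            out.write(f"{info.filename}\t{info.CRC}\n")

def read_manifest(manifest_path):
    entries = {}
    with open(manifest_path) as f:
        for line in f:
            name, crc = line.rstrip("\n").rsplit("\t", 1)
            entries[name] = crc
    return entries

def members_to_send(local_manifest, remote_manifest):
    # Members present remotely with a matching CRC can be skipped.
    local = read_manifest(local_manifest)
    remote = read_manifest(remote_manifest)
    return [n for n, crc in local.items() if remote.get(n) != crc]
```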

Ideally, of course, you want as much as possible to happen on the sending side (for example if anything needs to be extracted/re-zipped), to minimise traffic. You might be better off looking into real backup software rather than rolling your own.

Maybe you could just do weekly full and daily incremental zip files? Zip utilities usually allow you to add only those files which have been touched.
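A hedged sketch of what the daily incremental pass might look like, assuming changes are detected by modification time against a recorded timestamp of the previous run (all paths are placeholders):

```python
import os
import time
import zipfile

DATA_DIR = "/data"                    # assumed data tree
STAMP = "/backup/last_backup.stamp"   # assumed timestamp marker file

last_run = os.path.getmtime(STAMP) if os.path.exists(STAMP) else 0.0

# Add only files touched since the previous run to today's incremental zip.
inc_name = time.strftime("/backup/incr-%Y%m%d.zip")
with zipfile.ZipFile(inc_name, "w", zipfile.ZIP_DEFLATED) as zf:
    for root, _dirs, files in os.walk(DATA_DIR):
        for name in files:
            path = os.path.join(root, name)
            if os.path.getmtime(path) > last_run:
                zf.write(path, arcname=os.path.relpath(path, DATA_DIR))

# Touch the marker so the next pass only picks up newer changes.
with open(STAMP, "w"):
    pass
```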


Thank You for the response!

Ideally, of course, you want as much as possible to happen on the sending side (for example if anything needs to be extracted/re-zipped), to minimize traffic. You might be better off looking into real backup software rather than rolling your own.

All the remote sites already use one kind of backup software or another. Coordinating all the transfers around the backups is another fun 'side' project. This backup is meant for disaster recovery. To give a little more detail, all the remote sites do different accounting-related work. Some remote sites print payroll checks, some accept payments, etc. If a disaster struck on the right day, it could affect the payroll checks getting out on time; this backup is meant to minimize that chance.

Maybe you could just do weekly full and daily incremental zip files? Zip utilities usually allow you to add only those files which have been touched.

I've been playing around with a utility I found on the web called 'SyncronEx'.
It will do a one-way sync of a set of backup directories to another location. Even though it would end up being a full backup both in the backup directory at the remote site and on the FTP site, it is incremental in the sense that it only moves the files that have changed to the 2 other locations. The main problem I have with the program, and with this approach in general, is that I would feel more comfortable creating a secondary directory and syncing off of that, because of the time factor that can be involved in the transfer.


Personally, I'd maintain a copy on each remote site of the current state of the files backed up to the central site (keep separate files; the higher the granularity, the better the efficiency).
Every time a backup is scheduled, it would compare the current files against that duplicate of what is on the central site (no bandwidth used in this), only zipping and copying those files which have changed.
Alternatively, zip them all before the comparison to waste less local space.
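A rough sketch of that scheme, assuming the uncompressed-mirror variant: a local 'mirror' directory reflects what the central site last received, only files whose content differs from the mirror get zipped for transfer, and the mirror is updated so the next run's comparison stays accurate. All paths are placeholders.

```python
import filecmp
import os
import shutil
import zipfile

DATA_DIR = "/data"             # assumed live data
MIRROR_DIR = "/backup/mirror"  # assumed local copy of the central site's state
OUT_ZIP = "/backup/changed.zip"

with zipfile.ZipFile(OUT_ZIP, "w", zipfile.ZIP_DEFLATED) as zf:
    for root, _dirs, files in os.walk(DATA_DIR):
        for name in files:
            src = os.path.join(root, name)
            rel = os.path.relpath(src, DATA_DIR)
            ref = os.path.join(MIRROR_DIR, rel)
            # New file, or content differs from the mirror copy?
            if not os.path.exists(ref) or not filecmp.cmp(src, ref, shallow=False):
                zf.write(src, arcname=rel)
                os.makedirs(os.path.dirname(ref), exist_ok=True)
                shutil.copy2(src, ref)  # bring the mirror up to date

# OUT_ZIP now holds only the changed files and is what goes over FTP.
```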

