rsync files to hadoop


Question

I have 6 servers, each of which contains a lot of logs. I'd like to push these logs to the Hadoop fs via rsync. At the moment I'm using FUSE, and rsync writes directly to the FUSE-mounted fs at /mnt/hdfs. But there is a big problem: after about a day, the FUSE daemon occupies 5 GB of RAM and it's no longer possible to do anything with the mounted fs. I then have to remount FUSE, after which everything works again, but only for a while. The rsync command is:

rsync --port=3360 -az --timeout=10 --contimeout=30 server_name::ap-rsync/archive /mnt/hdfs/logs
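The `server_name::ap-rsync/archive` source uses rsync's daemon-module syntax on port 3360. The question doesn't show the daemon side; a minimal rsyncd.conf on each log server would look roughly like this (the module path is an assumption, only the module name and port come from the command above):

```ini
# /etc/rsyncd.conf on each log server (illustrative sketch)
port = 3360

[ap-rsync]
    path = /var/log/archive   ; assumed location of the logs
    read only = yes
```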

After some time, rsync produces these error messages:

rsync error: timeout in data send/receive (code 30) at io.c(137) [sender=3.0.7]
rsync: connection unexpectedly closed (498784 bytes received so far) [receiver]
rsync error: error in rsync protocol data stream (code 12) at io.c(601) [receiver=3.0.7]
rsync: connection unexpectedly closed (498658 bytes received so far) [generator]
rsync error: error in rsync protocol data stream (code 12) at io.c(601) [generator=3.0.7]

Answer

fuse-hdfs does not support O_RDWR or O_EXCL, so rsync gets an EIO error. If you want to use rsync with fuse-hdfs, you need to patch the code. There are two ways to do it; either one works. I recommend the second.

  1. Patch fuse-hdfs; the patch can be found in the Hadoop JIRA:

https://issues.apache.org/jira/browse/HDFS-861

  2. Patch rsync (version 3.0.8):

diff -r rsync-3.0.8.no_excl/syscall.c rsync-3.0.8/syscall.c

234a235,252
> #if defined HAVE_SECURE_MKSTEMP && defined HAVE_FCHMOD && (!defined HAVE_OPEN64 || defined HAVE_MKSTEMP64)
>   {
>       int fd = mkstemp(template);
>       if (fd == -1)
>           return -1;
>       if (fchmod(fd, perms) != 0 && preserve_perms) {
>           int errno_save = errno;
>           close(fd);
>           unlink(template);
>           errno = errno_save;
>           return -1;
>       }
> #if defined HAVE_SETMODE && O_BINARY
>       setmode(fd, O_BINARY);
> #endif
>       return fd;
>   }
> #else
237c255,256
<   return do_open(template, O_WRONLY|O_CREAT, perms);
---
>   return do_open(template, O_RDWR|O_EXCL|O_CREAT, perms);
> #endif

