Getting "No space left on device" for approx. 10 GB of data on EMR m1.large instances


Problem description


I am getting a "No space left on device" error when I run my Amazon EMR jobs using m1.large as the instance type for the Hadoop instances created by the jobflow. The job generates at most approx. 10 GB of data, and the capacity of an m1.large instance is supposed to be 420 GB * 2 (according to: EC2 instance types), so I am confused how just 10 GB of data could lead to a "disk space full" kind of message. I am aware that this error can also occur if the total number of inodes allowed on the filesystem has been completely exhausted, but that limit runs into the millions and I am pretty sure my job is not producing that many files. I have also seen that when I create an EC2 instance of type m1.large independently, it is assigned an 8 GB root volume by default. Could this be the reason behind the way instances are provisioned in EMR as well? And when do the disks of size 420 GB get allotted to an instance?

Also, here is the output of "df -hi", "mount", and "lsblk":

$ df -hi
Filesystem            Inodes   IUsed   IFree IUse% Mounted on
/dev/xvda1              640K    100K    541K   16% /
tmpfs                   932K       3    932K    1% /lib/init/rw
udev                    930K     454    929K    1% /dev
tmpfs                   932K       3    932K    1% /dev/shm
ip-10-182-182-151.ec2.internal:/mapr
                        100G     50G     50G   50% /mapr

$ mount
/dev/xvda1 on / type ext3 (rw,noatime)
tmpfs on /lib/init/rw type tmpfs (rw,nosuid,mode=0755)
proc on /proc type proc (rw,noexec,nosuid,nodev)
sysfs on /sys type sysfs (rw,noexec,nosuid,nodev)
udev on /dev type tmpfs (rw,mode=0755)
tmpfs on /dev/shm type tmpfs (rw,nosuid,nodev)
devpts on /dev/pts type devpts (rw,noexec,nosuid,gid=5,mode=620)
/var/run on /run type none (rw,bind)
/var/lock on /run/lock type none (rw,bind)
/dev/shm on /run/shm type none (rw,bind)
rpc_pipefs on /var/lib/nfs/rpc_pipefs type rpc_pipefs (rw)
ip-10-182-182-151.ec2.internal:/mapr on /mapr type nfs (rw,addr=10.182.182.151)


$ lsblk
NAME  MAJ:MIN RM   SIZE RO TYPE MOUNTPOINT
xvda1 202:1    0    10G  0 disk /
xvdb  202:16   0   420G  0 disk 
xvdc  202:32   0   420G  0 disk
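
For completeness, a quick way to confirm which directories on the small root volume are actually consuming the space (a hedged sketch using standard GNU coreutils; /var is simply where the tasktracker logs typically end up):

$ df -h /                                              # how full is the root filesystem
$ sudo du -xh /var --max-depth=2 | sort -h | tail -20  # largest subtrees, staying on one filesystem (-x)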

Solution

With the help of @slayedbylucifer I was able to identify that the problem was that the complete disk space is made available to HDFS on the cluster by default. Hence, only the default 10 GB of space mounted on / is available for local use by the machine. There is an option called --mfs-percentage which can be used (while using the MapR distribution of Hadoop) to specify the split of disk space between the local filesystem and HDFS; the local filesystem's share is mounted at /var/tmp. Make sure that the option mapred.local.dir is set to a directory inside /var/tmp, because that is where all the logs of the tasktracker attempts go, and these can be huge for big jobs. In my case the logging was causing the disk space error. I set --mfs-percentage to 60 and was able to run the job successfully thereafter.
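
For reference, here is a minimal sketch of how this fix might be applied when launching the cluster. It assumes the older elastic-mapreduce Ruby CLI and EMR's configure-hadoop bootstrap action; the instance count, MapR edition, exact flag spellings, and the /var/tmp/mapred/local path are illustrative assumptions to be checked against the EMR and MapR documentation for your versions:

# 60% of each node's disks go to MapR-FS, the rest stays on the local FS mounted at /var/tmp;
# mapred.local.dir points at a (hypothetical) directory under /var/tmp so that tasktracker
# attempt logs stay off the 10 GB root volume.
$ ./elastic-mapreduce --create --alive \
    --instance-type m1.large --num-instances 5 \
    --supported-product mapr \
    --args "--edition,m3,--mfs-percentage,60" \
    --bootstrap-action s3://elasticmapreduce/bootstrap-actions/configure-hadoop \
    --args "-m,mapred.local.dir=/var/tmp/mapred/local"

The idea is the same however the cluster is launched: give the local filesystem a large enough share via --mfs-percentage, and keep mapred.local.dir inside /var/tmp so the attempt logs do not land on /.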
