Reading large JSON files from S3 in RStudio EC2 instance (Louis Aslett's AMI)


Question


I am going through a similar issue as this question here:


I have a big JSON file on AWS S3 and am trying to access it via RStudio (an EC2 instance from Louis Aslett's AMI). I even tried moving from a t2 instance to an r4.xlarge with 30GB of memory, but to no avail. I receive this error:



Error in writeBin(httr::content(r, as = "raw"), con = file) : long vectors not supported yet: connections.c:4147


If I use the free tier instance then it gives me the error:



Error in curl::curl_fetch_memory(url, handle = handle) : Failed writing body (0 != 16360)


It seems that the question I referenced has found a way to do it, though I am not able to follow it completely. Can someone please explain a little what they mean when they say that the directory needs to be something other than "home"? How do you implement it? There is no permission to do that in Louis Aslett's AMI. The question may be very basic, but I am at my wits' end here.


Cheers! A

Recommended answer


"Can someone please explain a little when they say that the directory needs to be something else than "home". How do you implement it? Because there is no permission to do that in Louis AMI. The question may be very basic but I am getting out of my wits here."


I am sympathetic here, as this is counterintuitive to a new Linux user coming from Windows, IMHO. Ironically, I have seen two questions on this topic closed for being considered too basic for this advanced forum. But you are not alone: from personal experience of the same error message when reading in data with the same AMI, it sounds like the same problem.


If you upload into a different drive on the instance, this can most likely be solved. Louis Aslett's RStudio AMI is based in an 8-10GB space, so you will have to set your working directory outside of it, that is, outside the home directory. This is not intuitively apparent from the RStudio Server interface.
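To see how much room the home directory's volume actually has, one option is to call the Linux `df` tool from within R. The following is only a minimal sketch: `disk_report` is a hypothetical helper name, not part of the AMI or of any package, and it assumes `df` is available on the instance.

```r
# Hypothetical helper: report free space on the filesystem that holds `path`.
# `df -h <path>` prints a header line plus one line for that filesystem.
disk_report <- function(path) {
  system2("df", c("-h", shQuote(path.expand(path))), stdout = TRUE)
}

# Inspect the volume that holds the home directory.
print(disk_report("~"))
```

If the "Avail" column is smaller than the JSON file, a download into the home directory is likely to run out of space.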


I would suggest having a look at other directories (e.g. by going up a few levels above home in the RStudio directory selection box on the right-hand side, or by running the df command on the Linux command line). Then call setwd() on another directory (e.g. xda or whatever has enough room) and try to read the file in again.
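The steps above could look roughly like this in R. This is a sketch under assumptions: `use_data_dir` is a hypothetical helper, and `/mnt/data` is a placeholder path, not something the AMI is known to provide. Substitute whichever mount point `df` shows with enough free space.

```r
# List every mounted filesystem with its size, used and available space.
print(system2("df", "-h", stdout = TRUE))

# Hypothetical helper: switch the working directory to `dir`, but only if it
# exists and the current user is allowed to write there.
use_data_dir <- function(dir) {
  if (!dir.exists(dir)) stop("Directory does not exist: ", dir)
  # file.access() returns 0 when the requested access (mode 2 = write) is granted.
  if (file.access(dir, mode = 2) != 0) stop("No write permission for: ", dir)
  setwd(dir)
  invisible(getwd())
}

# Example call; "/mnt/data" is an assumed path, not taken from the AMI.
# use_data_dir("/mnt/data")
```

The explicit write check is there because the question mentions permission problems: a clear error message on failure is easier to act on than a failed download later.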
