Reading a pickle file (PANDAS Python Data Frame) in R

Question

Is there an easy way to read a pickle file (.pkl) containing a Pandas DataFrame into R?

One possibility is to export to CSV and have R read the CSV, but that seems really cumbersome to me because my dataframes are rather large. Is there an easier way to do this?

Thanks!

Recommended answer

You could load the pickle in Python and then export it to R via the Python package rpy2 (or similar). Once you've done so, your data will exist in an R session linked to Python. I suspect that what you'd want to do next is use that session to call R's saveRDS and write the data to a file or RAM disk. Then, in RStudio, you can read that file back in. Look at the R packages rJython and rPython for ways in which you could trigger the Python commands from R.

Alternatively, you could write a simple Python script to load your data in Python (probably using one of the R packages noted above) and write a formatted data stream to stdout. Then that entire system call to the script (including the argument that specifies your pickle) can be used as an argument to fread in the R package data.table. Or, if you wanted to keep to standard functions, you could use a combination of system(..., intern=TRUE) and read.table.
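A minimal sketch of such a script, assuming pandas is installed and that CSV is an acceptable text format for the stream. The script name `pickle_to_csv.py` and the function name `pickle_to_csv_stream` are hypothetical.

```python
import sys

import pandas as pd


def pickle_to_csv_stream(pickle_path, out=sys.stdout):
    """Load a pickled pandas DataFrame and write it as CSV to `out`."""
    df = pd.read_pickle(pickle_path)
    # index=False leaves out the pandas row index, so R only sees
    # the data columns plus a header row.
    df.to_csv(out, index=False)


if __name__ == "__main__" and len(sys.argv) == 2:
    # Usage (hypothetical file names): python pickle_to_csv.py data.pkl
    pickle_to_csv_stream(sys.argv[1])
```

From R, the whole command could then be handed to fread, for example `data.table::fread(cmd = "python pickle_to_csv.py data.pkl")`, or combined with `system(..., intern=TRUE)` and `read.table` as described above.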

As usual, there are /many/ ways to skin this particular cat. The basic steps are:

  1. Load the data in Python
  2. Express the data to R (e.g., exporting the object via rpy2 or writing formatted text to stdout with R ready to receive it on the other end)
  3. Serialize the expressed data in R to an internal data representation (e.g., exporting the object via rpy2 or reading it with fread)
  4. (optional) Make the data in that session of R accessible to another R session (i.e., the step to close the loop with rpy2; if you've been using fread, then you're already done).
