How to put datasets into an R package

Views: 636 Published: 2017/4/2 12:10:55 Tags: r dataset r-package

Problem description

I am creating my own R package and I was wondering what methods I can use to add (time-series) datasets to my package. Here are the specifics:

I have created a package subdirectory called data and I am aware that this is the location where I should save the datasets that I want to add to my package. I am also cognizant of the fact that the files containing the data may be .rda, .txt, or .csv files.

Each series of data that I want to add to the package consists of a single column of numbers (e.g., of the form 340 or 4.5), and each series differs in length.

So far, I have saved all of the datasets into a .txt file. I have also successfully loaded the data using the data() function. Problem not solved, however.

The problem is that each series of data loads as a factor except for the series greatest in length. The series that load as factors contain missing values (of the form '.'). I had to add these missing values in order to make each column of data the same in length. I tried saving the data as unequal columns, but I received an error message after calling data().

A consequence of adding missing values to get the data to load is that once the data is loaded, I need to remove the NA's in order to get on with my analysis of the data! So, this clearly is not a good way of doing things.

Ideally (I suppose), I would like the data to load as numeric vectors or as a list. In this way, I wouldn't need the NA's appended to the end of each series.

How do I solve this problem? Should I save all of the data into one single file? If so, in what format should I do it? Perhaps I should save the datasets into a number of files? Again, in which format? What is the best practical way of doing this? Any tips would be greatly appreciated.

Solution

I'm not sure I understood your question correctly. But if you get your data into the shape you want in R and save it with

save(myediteddata, file="data.rda")

the data should load exactly the way you saw it in R.
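As a minimal sketch of that idea, assuming the series are kept as a named list of numeric vectors so that no NA padding is needed: the series names and values below are made up, and a temporary directory stands in for the package's data subdirectory. Naming the .rda file after the object it contains is a common convention, not something the answer requires.

```r
# Hypothetical series of unequal length; names and values are made up.
series_a <- c(340, 4.5, 12, 7.25)
series_b <- c(1.5, 2.5)
myediteddata <- list(series_a = series_a, series_b = series_b)

# In a real package the target would be something like "data/myediteddata.rda";
# a temporary directory stands in for the package's data/ folder here.
data_dir <- file.path(tempdir(), "data")
dir.create(data_dir, showWarnings = FALSE)
rda_path <- file.path(data_dir, "myediteddata.rda")
save(myediteddata, file = rda_path)

# Load into a fresh environment to confirm the object comes back unchanged.
env <- new.env()
load(rda_path, envir = env)
str(env$myediteddata)
```

Because save() stores the R object itself, each series stays a numeric vector of its own length when it is loaded again.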

To make the datasets in the data directory available as soon as the package is loaded, without calling data(), add

LazyData: true

to the DESCRIPTION file of your package.
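To see where that line sits, here is a small sketch that writes a stripped-down DESCRIPTION file to a temporary directory and reads the field back. The package name mypkg and the other field values are placeholders, and a real DESCRIPTION needs more fields than shown.

```r
# Hypothetical minimal DESCRIPTION; package name and field values are placeholders.
desc <- data.frame(
  Package  = "mypkg",
  Version  = "0.1.0",
  Title    = "Example Package With Bundled Data",
  LazyData = "true",
  stringsAsFactors = FALSE
)
desc_path <- file.path(tempdir(), "DESCRIPTION")
write.dcf(desc, file = desc_path)

# Print the file, then read the field back to check what R will see.
cat(readLines(desc_path), sep = "\n")
lazy <- read.dcf(desc_path, fields = "LazyData")[1, "LazyData"]
```

With this field set and the package installed, an object saved in the data directory (myediteddata in the answer's example) should be usable right after library(mypkg), where mypkg again stands for your own package name.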

If this doesn't help, you could post one of your files and a printout of the format you want; that will help us to help you ;)
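If the series already sit in a padded text file like the one described in the question, one possible way to get to such an object is sketched below. The file contents and column names are invented for illustration, and the approach assumes that '.' only ever marks padding: a genuine missing value inside a series would be dropped as well.

```r
# Recreate a hypothetical padded file: '.' pads the shorter series.
txt_path <- file.path(tempdir(), "series.txt")
writeLines(c("series_a series_b",
             "340 1.5",
             "4.5 2.5",
             "12 .",
             "7.25 ."), txt_path)

# Treat '.' as missing so every column is read as numeric, not as a factor.
padded <- read.table(txt_path, header = TRUE, na.strings = ".")

# Drop the padding, leaving a named list of numeric vectors of unequal length.
myediteddata <- lapply(padded, function(x) x[!is.na(x)])
str(myediteddata)
```

The resulting list can then be saved with save() as in the answer above, so the text file itself no longer has to ship in the package.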
