How to address data in a hierarchical data structure in R?

Question

I created a list with two components of the same length: a character vector region and a list results. (I tried to manage the data in a single data.frame, but adding data to a data.frame seemed complicated.)

study = list(
    region  = character(),
    results = list()
)

study$region[1] = "Hamburg"
study$results[[1]]  = data.frame(month=c(1:5), maxTemp=c(-12, -1, 3, 10, 23))


study$region[2]    = "Bremen"
study$results[[2]]  = data.frame(month=c(1:5), maxTemp=c(-9, -1, 6, 10, 21))

str(study)

print("Maximum temperature of all study regions:")
max(study$results[[1:2]]$maxTemp)

I want to find the maximum temperature across all time points of all regions. I can address the regions one at a time, e.g. max(study$results[[1]]$maxTemp), but when I try to address all regions at once with max(study$results[[1:2]]$maxTemp) I receive an error:


Error in study$results[[1:2]]$maxTemp :

$ operator is invalid for atomic vectors

Where is my mistake? How can I address fields of several data.frames that are stored in a list inside a list? And what are atomic vectors?

Recommended answer

[[ can only return a single element. I expected [[ to throw an error because of that, rather than the error you are seeing, but ?"[" describes what R does with a call such as yours and explains the behaviour (from ?"["):


Recursive (list-like) objects: ....

 ‘[[’ can be applied recursively to lists, so that if the single
 index ‘i’ is a vector of length ‘p’, ‘alist[[i]]’ is equivalent to
 ‘alist[[i1]]...[[ip]]’ providing all but the final indexing
 results in a list.


The reason for the error is:

> study$results[[c(1,2)]]
[1] -12  -1   3  10  23

which means R is really doing this:

> study$results[[1]][[2]]
[1] -12  -1   3  10  23

i.e. it returns the second component (column) of the first data frame. That column is an atomic vector, because a data frame is a list of columns and [[ extracts the column itself. $ cannot be used on atomic vectors, hence the error.
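
To make the recursive indexing concrete, here is a minimal sketch that rebuilds the question's data and compares the two forms. It also touches on the "what are atomic vectors" part of the question: an atomic vector is a plain vector of a single basic type (numeric, character, logical, ...), as opposed to a list. The variable names recursive, stepwise and both are only illustrative.

```r
# Rebuild the data from the question so the example is self-contained.
study <- list(region = character(), results = list())
study$region[1] <- "Hamburg"
study$results[[1]] <- data.frame(month = 1:5, maxTemp = c(-12, -1, 3, 10, 23))
study$region[2] <- "Bremen"
study$results[[2]] <- data.frame(month = 1:5, maxTemp = c(-9, -1, 6, 10, 21))

# A vector index inside [[ ]] is applied recursively, one level per element.
recursive <- study$results[[c(1, 2)]]
stepwise  <- study$results[[1]][[2]]
print(identical(recursive, stepwise))  # TRUE

# The result is a plain numeric column: an atomic vector, not a list.
print(is.atomic(recursive))  # TRUE
print(is.list(recursive))    # FALSE

# Single brackets keep the list structure and can select several elements.
both <- study$results[1:2]
print(is.list(both))  # TRUE
print(length(both))   # 2
```

Note that study$results[1:2] still returns a list of data frames, so it cannot be followed by $maxTemp directly; each element has to be processed in turn.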

If you want to iterate over the list that is study$results, lapply() or sapply() are your friends:

> lapply(study$results, function(y) max(y[, "maxTemp"], na.rm = TRUE))
[[1]]
[1] 23

[[2]]
[1] 21

> sapply(study$results, function(y) max(y[, "maxTemp"], na.rm = TRUE))
[1] 23 21
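
The question asked for a single maximum over all regions, which the per-region results do not give directly. One way to get there, sketched under the assumption that the data are exactly as in the question, is to take the maximum of the per-region maxima or to pool all values first. The names region_max, overall_max and all_temps are illustrative only.

```r
# Rebuild the data from the question so the example is self-contained.
study <- list(region = character(), results = list())
study$region[1] <- "Hamburg"
study$results[[1]] <- data.frame(month = 1:5, maxTemp = c(-12, -1, 3, 10, 23))
study$region[2] <- "Bremen"
study$results[[2]] <- data.frame(month = 1:5, maxTemp = c(-9, -1, 6, 10, 21))

# Option 1: per-region maxima first, then the maximum of those.
region_max <- sapply(study$results, function(y) max(y[, "maxTemp"], na.rm = TRUE))
overall_max <- max(region_max)
print(overall_max)  # 23

# Option 2: pool every maxTemp value into one vector, then take the maximum.
all_temps <- unlist(lapply(study$results, function(y) y[, "maxTemp"]))
print(max(all_temps, na.rm = TRUE))  # 23

# The region that holds the overall maximum.
print(study$region[which.max(region_max)])  # "Hamburg"
```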

If you put names on the components of $results, you get them in the output too:

> names(study$results) <- study$region
> lapply(study$results, function(y) max(y[, "maxTemp"], na.rm = TRUE))
$Hamburg
[1] 23

$Bremen
[1] 21

> sapply(study$results, function(y) max(y[, "maxTemp"], na.rm = TRUE))
Hamburg  Bremen 
     23      21

which is easier to use, and you could then drop the $region component if you wish.
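
The question mentions that keeping everything in one data.frame seemed complicated. As an alternative, not part of the answer above, a named list of data frames can be stacked into a single long data frame with base R. This is only a sketch; results, long and by_region are illustrative names, and the data are those from the question.

```r
# A named list of per-region data frames, as in the answer above.
results <- list(
  Hamburg = data.frame(month = 1:5, maxTemp = c(-12, -1, 3, 10, 23)),
  Bremen  = data.frame(month = 1:5, maxTemp = c(-9, -1, 6, 10, 21))
)

# Add the region name as a column to each data frame, then stack them.
with_region <- Map(
  function(df, name) data.frame(region = name, df, stringsAsFactors = FALSE),
  results, names(results)
)
long <- do.call(rbind, with_region)
rownames(long) <- NULL

# Overall maximum across all regions and months.
print(max(long$maxTemp, na.rm = TRUE))  # 23

# Per-region maxima from the same long data frame.
by_region <- aggregate(maxTemp ~ region, data = long, FUN = max)
print(by_region)
```

In this layout, adding a new region is one more rbind() call, and plain column access such as long$maxTemp covers all regions at once.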
