How to address data in a hierarchical data structure in R?
Question
I created a list containing two components of the same length: a character vector, region, and a list, results. (I tried to manage the data in a data.frame, but adding data to a data.frame seemed complicated.)
study = list(
region = character(),
results = list()
)
study$region[1] = "Hamburg"
study$results[[1]] = data.frame(month=c(1:5), maxTemp=c(-12, -1, 3, 10, 23))
study$region[2] = "Bremen"
study$results[[2]] = data.frame(month=c(1:5), maxTemp=c(-9, -1, 6, 10, 21))
str(study)
print("Maximum temperature of all study regions:")
max(study$results[[1:2]]$maxTemp)
I want to find the maximum temperature across all time points of all regions. I can address the regions one after another, e.g. max(study$results[[1]]$maxTemp), but when I try to address all regions with max(study$results[[1:2]]$maxTemp) I receive an error:
Error in study$results[[1:2]]$maxTemp :
  $ operator is invalid for atomic vectors
Where is my mistake? How can I address the fields of several data.frames that are stored in a list inside a list? And what are atomic vectors?
Recommended answer
[[ can only return a single element. I would have expected [[ to throw an error because of that, rather than the error you are seeing, but ?"[" describes what R does with a call such as yours and explains the behaviour (from ?"["):
Recursive (list-like) objects:
....
‘[[’ can be applied recursively to lists, so that if the single
index ‘i’ is a vector of length ‘p’, ‘alist[[i]]’ is equivalent to
‘alist[[i1]]...[[ip]]’ providing all but the final indexing
results in a list.
The reason for the error is that
> study$results[[c(1,2)]]
[1] -12 -1 3 10 23
shows that R is really doing
> study$results[[1]][[2]]
[1] -12 -1 3 10 23
i.e. it returns the second component (column) of the first data frame. That column is an atomic vector, because a data frame is a list of columns and [[ extracts the column itself. $ cannot be used on atomic vectors, hence the error.
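The distinction can be checked directly. The following is a minimal sketch that rebuilds the study list from the question; is.atomic() and is.list() are base R functions, while the variable names col and both are only illustrative.

```r
# Rebuild the example data from the question
study <- list(region = character(), results = list())
study$region[1] <- "Hamburg"
study$results[[1]] <- data.frame(month = 1:5, maxTemp = c(-12, -1, 3, 10, 23))
study$region[2] <- "Bremen"
study$results[[2]] <- data.frame(month = 1:5, maxTemp = c(-9, -1, 6, 10, 21))

# [[c(1, 2)]] indexes recursively: first list element, then its second column
col <- study$results[[c(1, 2)]]
identical(col, study$results[[1]][[2]])

# The column is a plain numeric vector: atomic, not a list, so `$` fails on it
is.atomic(col)
is.list(col)

# Single-bracket [1:2] keeps the list structure and returns both data frames
both <- study$results[1:2]
is.list(both)
length(both)
```

In short, [[ digs into one element, whereas [ returns a sub-list, which is why only the latter can hold several data frames at once.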
If you want to iterate over the list study$results, lapply() or sapply() are your friends:
> lapply(study$results, function(y) max(y[, "maxTemp"], na.rm = TRUE))
[[1]]
[1] 23
[[2]]
[1] 21
> sapply(study$results, function(y) max(y[, "maxTemp"], na.rm = TRUE))
[1] 23 21
If you put names on the components of $results, you get them in the output too:
> names(study$results) <- study$region
> lapply(study$results, function(y) max(y[, "maxTemp"], na.rm = TRUE))
$Hamburg
[1] 23
$Bremen
[1] 21
> sapply(study$results, function(y) max(y[, "maxTemp"], na.rm = TRUE))
Hamburg  Bremen
     23      21
which is easier to work with, and it means you can drop the $region component if you wish.
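The question asked for a single maximum over all regions, which the per-region results above do not give directly. One possible sketch, assuming the named $results layout just described (the names region_max, overall_max, hottest and pooled_max are illustrative, not from the original answer):

```r
# Named list of per-region data frames, as in the answer above
study <- list(results = list(
  Hamburg = data.frame(month = 1:5, maxTemp = c(-12, -1, 3, 10, 23)),
  Bremen  = data.frame(month = 1:5, maxTemp = c(-9, -1, 6, 10, 21))
))

# Per-region maxima as a named numeric vector; vapply() checks the result type
region_max <- vapply(study$results,
                     function(y) max(y$maxTemp, na.rm = TRUE),
                     numeric(1))

# Overall maximum and the region in which it occurs
overall_max <- max(region_max)
hottest <- names(region_max)[which.max(region_max)]

# Alternative: pool every maxTemp value first, then take a single maximum
pooled_max <- max(unlist(lapply(study$results, `[[`, "maxTemp")), na.rm = TRUE)
```

vapply() is used here instead of sapply() only because it fails loudly if a component does not return exactly one number; sapply() would work equally well on this data.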