How to group a vector into a list of vectors?
Problem description
I have some data which looks like this (fake data for example's sake):
dressId color
6 yellow
9 red
10 green
10 purple
10 yellow
12 purple
12 red
where color is a factor vector. It is not guaranteed that all possible levels of the factor actually appear in the data (e.g. the color "blue" could also be one of the levels).
I need a list of vectors which groups the available colors of each dress:
[[1]]
yellow
[[2]]
red
[[3]]
green purple yellow
[[4]]
purple red
Preserving the IDs of the dresses would be nice (e.g. a dataframe where this list is the second column and the IDs are the first), but not necessary.
I wrote a loop which goes through the dataframe row by row, and while the next ID is the same, it adds the color to a vector (I am sure that the data is sorted by ID). When the ID in the first column changes, it adds the vector to a list:
result <- NULL
while(blah blah)
{
some code which creates the vector called "colors"
result[[dressCounter]] <- colors
dressCounter <- dressCounter + 1
}
After wrestling with getting all the necessary counting variables correct, I found out to my dismay that it doesn't work. The first time, colors is
[1] yellow
Levels: green yellow purple red blue
and it gets coerced into an integer, so result becomes 2.
In the second loop repetition, colors only contains red, and result becomes a simple integer vector, [1] 2 4.
In the third repetition, colors is a vector now,
[1] green purple yellow
Levels: green yellow purple red blue
and I get
result[[3]] <- colors
Error in result[[3]] <- colors :
more elements supplied than there are to replace
What am I doing wrong? Is there a way to initialize result so it doesn't get converted into a numeric vector, but becomes a list of vectors?
Also, is there another way to do the whole thing than "roll my own"?
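On the initialization question, here is a minimal sketch, assuming the loop roughly follows the structure described above. Starting from NULL lets the first length-one assignment turn result into an atomic integer vector; starting from an empty list keeps each factor intact. The names d, ids and i are illustrative and not taken from the original loop, and the level order is copied from the printed output in the question.

```r
# Illustrative data; the level order follows the output shown in the question.
d <- data.frame(
  dressId = c(6, 9, 10, 10, 10, 12, 12),
  color = factor(c("yellow", "red", "green", "purple", "yellow", "purple", "red"),
                 levels = c("green", "yellow", "purple", "red", "blue"))
)

ids <- unique(d$dressId)

# Pre-allocate a list (not NULL) so that [[<- stores each factor as one element.
result <- vector("list", length(ids))

for (i in seq_along(ids)) {
  # Hypothetical stand-in for "some code which creates the vector called colors".
  colors <- d$color[d$dressId == ids[i]]
  result[[i]] <- colors
}

# Keep the dress IDs as the list names.
names(result) <- ids
```

Using list() instead of vector("list", length(ids)) should also work if the number of groups is not known in advance.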
split.data.frame is a good way to organize this; then extract the color component.
d <- data.frame(dressId=c(6,9,10,10,10,12,12),
                color=factor(c("yellow","red","green",
                               "purple","yellow",
                               "purple","red"),
                             levels=c("red","orange","yellow",
                                      "green","blue","purple")))
I think the version you want is actually this:
ss <- split.data.frame(d,d$dressId)
You can get something more like the list you requested by extracting the color component:
lapply(ss,"[[","color")
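As a possible shortcut, split() can also be applied to the color column directly, which skips the intermediate list of data frames. The sketch below reuses the example data from this answer; by_dress, as_chars and out are hypothetical names, and the list-column data frame at the end is only one way the IDs might be kept next to the colors.

```r
d <- data.frame(dressId = c(6, 9, 10, 10, 10, 12, 12),
                color = factor(c("yellow", "red", "green",
                                 "purple", "yellow",
                                 "purple", "red"),
                               levels = c("red", "orange", "yellow",
                                          "green", "blue", "purple")))

# Split the factor itself by dress ID: one factor per ID, named by the ID.
by_dress <- split(d$color, d$dressId)

# Optional: plain character vectors, if the unused levels are not needed.
as_chars <- lapply(by_dress, as.character)

# Optional: keep the IDs in a first column and the grouped colors in a list column.
out <- data.frame(dressId = as.numeric(names(by_dress)))
out$colors <- unname(by_dress)
```

The list names ("6", "9", "10", "12") carry the dress IDs, so the separate data frame is only needed when a tabular result is preferred.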