How to create categories conditionally using other variables' values and sequence
Question
I would appreciate any help creating a function that builds the categories of one variable from the order in which combinations of values of a set of other variables appear.
Specifically, I want a function that:
- creates category E1 of the column named variable the first time that each combination of values of the variables A, B, and ID appears in the dataset.
- creates category E2 of the column named variable the second time that each combination of values of the variables A, B, and ID appears in the dataset.
- creates category E3 of the column named variable the third time that each combination of values of the variables A, B, and ID appears in the dataset.
- creates category En of the column named variable the nth time that each combination of values of the variables A, B, and ID appears in the dataset.
# Sample data:
library(data.table)
rowdT<-structure(list(A = c("a1", "a2", "a1", "a1", "a2", "a1", "a1",
"a2", "a1"), B = c("b2", "b2", "b2", "b1", "b2", "b2", "b1",
"b2", "b1"), ID = c("3", "4", "3", "1", "4", "3", "1", "4", "1"
), E = c(0.621142094943352, 0.742109450696123, 0.39439152996948,
0.40694392882818, 0.779607277916503, 0.550579323666347, 0.352622183880119,
0.690660491345867, 0.23378944873769)), class = c("data.table",
"data.frame"), row.names = c(NA, -9L))
sampleDT <- melt(rowdT, id.vars = c("A", "B", "ID"))
# Input data:
    A  B ID variable     value
1: a1 b2  3        E 0.6211421
2: a2 b2  4        E 0.7421095
3: a1 b2  3        E 0.3943915
4: a1 b1  1        E 0.4069439
5: a2 b2  4        E 0.7796073
6: a1 b2  3        E 0.5505793
7: a1 b1  1        E 0.3526222
8: a2 b2  4        E 0.6906605
9: a1 b1  1        E 0.2337894
# Expected output:
    A  B ID variable     value
4: a1 b1  1       E1 0.4069439
1: a1 b2  3       E1 0.6211421
2: a2 b2  4       E1 0.7421095
7: a1 b1  1       E2 0.3526222
3: a1 b2  3       E2 0.3943915
5: a2 b2  4       E2 0.7796073
9: a1 b1  1       E3 0.2337894
6: a1 b2  3       E3 0.5505793
8: a2 b2  4       E3 0.6906605
Thanks in advance for your help.
Recommended answer
First convert the variable column to a character vector so that the new labels are coerced properly (melt returns it as a factor by default), and then use data.table:
sampleDT$variable = as.character(sampleDT$variable)
sampleDT[, variable := paste(variable,1:.N,sep = ""), by = c("A", "B", "ID")]
This numbers the rows within each observed combination of A, B, and ID in order of appearance and appends that number to variable, so the first occurrence becomes E1, the second E2, and so on.
This produces the following output:
    A  B ID variable     value
1: a1 b2  3       E1 0.6211421
2: a2 b2  4       E1 0.7421095
3: a1 b2  3       E2 0.3943915
4: a1 b1  1       E1 0.4069439
5: a2 b2  4       E2 0.7796073
6: a1 b2  3       E3 0.5505793
7: a1 b1  1       E2 0.3526222
8: a2 b2  4       E3 0.6906605
9: a1 b1  1       E3 0.2337894
You can reorder the rows if necessary.
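Since the question asks for a function, the steps above could be wrapped up roughly as follows. This is only a sketch: label_occurrences() and the temporary occ column are hypothetical names, not part of data.table, and the code assumes the input has no existing column called occ. It sorts on the numeric occurrence count, not on the pasted label, because sorting the labels as text would put E10 before E2.

```r
library(data.table)

# Hypothetical helper (not a data.table function): suffix `col` with the
# running occurrence number of each combination of the `by` columns.
label_occurrences <- function(dt, by = c("A", "B", "ID"),
                              col = "variable", reorder = TRUE) {
  out <- copy(dt)                      # leave the caller's table untouched
  # occ is a temporary column; assumes no column named "occ" exists already
  out[, occ := seq_len(.N), by = by]   # 1, 2, 3, ... within each group
  # Replace the whole column, so a factor becomes a character vector
  out[, (col) := paste0(as.character(out[[col]]), occ)]
  if (reorder) setorderv(out, c("occ", by))  # numeric sort on the count
  out[, occ := NULL]
  out[]
}

# Melted sample from the question (values rounded as printed there)
sampleDT <- data.table(
  A  = c("a1", "a2", "a1", "a1", "a2", "a1", "a1", "a2", "a1"),
  B  = c("b2", "b2", "b2", "b1", "b2", "b2", "b1", "b2", "b1"),
  ID = c("3", "4", "3", "1", "4", "3", "1", "4", "1"),
  variable = factor(rep("E", 9)),      # melt() yields a factor by default
  value = c(0.6211421, 0.7421095, 0.3943915, 0.4069439, 0.7796073,
            0.5505793, 0.3526222, 0.6906605, 0.2337894)
)

res <- label_occurrences(sampleDT)
print(res)
```

With reorder = TRUE the rows should come out grouped by occurrence number and then by A, B, and ID, which is the ordering shown in the expected output of the question. Pass reorder = FALSE to keep the original row order.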