How to create new variables from one variable using two rules
Problem description
I would appreciate any help creating new variables from one variable.
Specifically, I need help to create one row per ID and several columns of E, where each new column of E (that is, E1, E2, E3) contains the values of E for the rows of that ID. I tried doing this with melt followed by spread, but I am getting the error:
Error: Duplicate identifiers for rows (4, 7, 9), (1, 3, 6), (2, 5, 8)
Additionally, I tried the solutions discussed here and here, but these did not work for my case because I need to be able to create row identifiers for rows (4, 1, 2), (7, 3, 5), and (9, 6, 8). That is, E for rows (4, 1, 2) should be named E1, E for rows (7, 3, 5) should be named E2, E for rows (9, 6, 8) should be named E3, and so on.
#data
dT<-structure(list(A = c("a1", "a2", "a1", "a1", "a2", "a1", "a1",
"a2", "a1"), B = c("b2", "b2", "b2", "b1", "b2", "b2", "b1",
"b2", "b1"), ID = c("3", "4", "3", "1", "4", "3", "1", "4", "1"
), E = c(0.621142094943352, 0.742109450696123, 0.39439152996948,
0.40694392882818, 0.779607277916503, 0.550579323666347, 0.352622183880119,
0.690660491345867, 0.23378944873769)), class = c("data.table",
"data.frame"), row.names = c(NA, -9L))
#my attempt
A B ID E
1: a1 b2 3 0.6211421
2: a2 b2 4 0.7421095
3: a1 b2 3 0.3943915
4: a1 b1 1 0.4069439
5: a2 b2 4 0.7796073
6: a1 b2 3 0.5505793
7: a1 b1 1 0.3526222
8: a2 b2 4 0.6906605
9: a1 b1 1 0.2337894
aTempDF <- melt(dT, id.vars = c("A", "B", "ID"))
A B ID variable value
1: a1 b2 3 E 0.6211421
2: a2 b2 4 E 0.7421095
3: a1 b2 3 E 0.3943915
4: a1 b1 1 E 0.4069439
5: a2 b2 4 E 0.7796073
6: a1 b2 3 E 0.5505793
7: a1 b1 1 E 0.3526222
8: a2 b2 4 E 0.6906605
9: a1 b1 1 E 0.2337894
aTempDF%>%spread(variable, value)
Error: Duplicate identifiers for rows (4, 7, 9), (1, 3, 6), (2, 5, 8)
#expected output
A B ID E1 E2 E3
1: a1 b2 3 0.6211421 0.3943915 0.5505793
2: a2 b2 4 0.7421095 0.7796073 0.6906605
3: a1 b1 1 0.4069439 0.3526222 0.2337894
Thanks for your help.
Recommended answer
You can use dcast from data.table:
library(data.table)
dcast(dT, A + B + ID ~ paste0("E", rowid(ID)), value.var = "E")
# A B ID E1 E2 E3
#1 a1 b1 1 0.4069439 0.3526222 0.2337894
#2 a1 b2 3 0.6211421 0.3943915 0.5505793
#3 a2 b2 4 0.7421095 0.7796073 0.6906605
You need to create the correct 'time variable' first, which is what rowid(ID) does: it numbers the repeated rows of each ID as 1, 2, 3, and so on. Those numbers become the suffixes of the new E columns.
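To make the role of rowid(ID) more concrete, here is a minimal sketch that prints the index it produces for the question's data and then builds the same wide table with an explicit helper column. The names idx, k, dT2, wide and wide2 are illustrative only and do not appear in the original post. The data is rebuilt with data.table() instead of structure() for brevity.

```r
library(data.table)

# Same values as the question's dT, rebuilt with data.table()
dT <- data.table(
  A  = c("a1", "a2", "a1", "a1", "a2", "a1", "a1", "a2", "a1"),
  B  = c("b2", "b2", "b2", "b1", "b2", "b2", "b1", "b2", "b1"),
  ID = c("3", "4", "3", "1", "4", "3", "1", "4", "1"),
  E  = c(0.621142094943352, 0.742109450696123, 0.39439152996948,
         0.40694392882818, 0.779607277916503, 0.550579323666347,
         0.352622183880119, 0.690660491345867, 0.23378944873769)
)

# rowid() counts the occurrences of each ID in order of appearance
idx <- rowid(dT$ID)
print(idx)
# ID:  3 4 3 1 4 3 1 4 1
# idx: 1 1 2 1 2 3 2 3 3

# The answer's one-liner, with the value column named explicitly
wide <- dcast(dT, A + B + ID ~ paste0("E", rowid(ID)), value.var = "E")

# Equivalent two-step version: store the per-ID counter in a helper
# column (hypothetical name `k`), then cast on that column
dT2 <- copy(dT)
dT2[, k := paste0("E", seq_len(.N)), by = ID]
wide2 <- dcast(dT2, A + B + ID ~ k, value.var = "E")

print(wide)
```

The melt/spread attempt likely failed for the same reason this index is needed: after melting, every row of a given ID has the same key (A, B, ID, variable), so spread has no way to tell the three E values apart. The per-ID counter supplies that missing distinction.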