Melt an array and make numeric values character
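Problem description
I have an array and I want to melt it based on its dimnames. The problem is that the dimension names are large numeric values, so converting them to character produces the wrong IDs. See the example: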
test <- array(1:18, dim = c(3,3,2), dimnames = list(c(00901291282245454545454,329293929929292,2929992929922929),
                                                    c("a", "b", "c"),
                                                    c("d", "e")))
library(reshape2)
library(data.table)
test2 <- data.table(melt(test))
test2[, Var1 := as.character(Var1)]
> test2
                    Var1 Var2 Var3 value
 1: 9.01291282245455e+20    a    d     1
 2:      329293929929292    a    d     2
 3:     2929992929922929    a    d     3
 4: 9.01291282245455e+20    b    d     4
 5:      329293929929292    b    d     5
 6:     2929992929922929    b    d     6
 7: 9.01291282245455e+20    c    d     7
 8:      329293929929292    c    d     8
 9:     2929992929922929    c    d     9
10: 9.01291282245455e+20    a    e    10
11:      329293929929292    a    e    11
12:     2929992929922929    a    e    12
13: 9.01291282245455e+20    b    e    13
14:      329293929929292    b    e    14
15:     2929992929922929    b    e    15
16: 9.01291282245455e+20    c    e    16
17:      329293929929292    c    e    17
18:     2929992929922929    c    e    18
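How can I make the first column, which holds the large IDs, character? What I am currently doing is pasting a letter onto the dimnames, melting, converting the column to character, and then taking a substring, which is really inefficient. The solution needs to be efficient because the dataset has millions of rows. There are two problems: first, leading 0s are dropped from the ID, and second, the ID is converted to e+20 notation.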
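For reference, both symptoms can be reproduced in base R without melting anything. This is only a small sketch; the variable names `id_num` and `id_chr` are made up for illustration. It suggests the damage happens as soon as the ID is written as a numeric literal, before `melt` is ever called.

```r
# An ID written as a numeric literal is parsed as a double:
# the leading zeros are not stored and only about 15-17 significant digits survive.
id_num <- 00901291282245454545454
as.character(id_num)            # scientific notation, no leading zeros

# The same ID written as a string is kept exactly as typed.
id_chr <- "00901291282245454545454"
nchar(id_chr)                   # 23 characters, zeros included

# The two representations no longer match.
identical(as.character(id_num), id_chr)
```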
You need to define your dimnames as character and then slightly modify melt.array, which is the function called when you run melt on your array:

test <- array(1:18, dim = c(3,3,2), dimnames = list(c("00901291282245454545454", "329293929929292", "2929992929922929"),
                                                    c("a", "b", "c"),
                                                    c("d", "e")))

Customise melt.array by adding a parameter that lets you decide whether you want the conversion or not:

melt.array2 <- function (data, varnames = names(dimnames(data)), conv=TRUE, ...)
{
    values <- as.vector(data)
    dn <- dimnames(data)
    # use empty placeholders when the array has no dimnames at all
    if (is.null(dn))
        dn <- vector("list", length(dim(data)))
    # fill any missing dimnames with 1..n index vectors
    dn_missing <- sapply(dn, is.null)
    dn[dn_missing] <- lapply(dim(data), function(x) 1:x)[dn_missing]
    if(conv){ # conv is the new parameter to know if conversion needs to be done
        char <- sapply(dn, is.character)
        dn[char] <- lapply(dn[char], type.convert)
    }
    indices <- do.call(expand.grid, dn)
    names(indices) <- varnames
    data.frame(indices, value = values)
}

Try the new function on your example (with conv=FALSE):

head(melt.array2(test, conv=FALSE))
#                        X1 X2 X3 value
# 1 00901291282245454545454  a  d     1
# 2         329293929929292  a  d     2
# 3        2929992929922929  a  d     3
# 4 00901291282245454545454  b  d     4
# 5         329293929929292  b  d     5
# 6        2929992929922929  b  d     6

EDIT

In the development version of reshape2 (devtools::install_github("hadley/reshape")), melt.array is defined differently and you can use the parameter as.is to avoid the conversion:

melt(test, as.is=TRUE)

will give you the same result as above (with Var1 etc. instead of X1 etc.).
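If patching melt.array is not an option, the same idea (build the index columns from the dimnames and skip type conversion) can be sketched in base R alone. This is a minimal sketch, not what reshape2 does internally: `melt_chr` is a hypothetical helper name, and it assumes every dimension of the array has dimnames that are already character strings.

```r
# Example array from the question, with the IDs supplied as strings
test <- array(1:18, dim = c(3, 3, 2),
              dimnames = list(c("00901291282245454545454", "329293929929292", "2929992929922929"),
                              c("a", "b", "c"),
                              c("d", "e")))

# Hypothetical helper (not part of reshape2): melt an array in base R
# while leaving every dimname as an untouched character string.
melt_chr <- function(data) {
  dn <- dimnames(data)
  # assumption of this sketch: all dimensions carry dimnames
  stopifnot(!is.null(dn), !any(vapply(dn, is.null, logical(1))))
  # expand.grid varies the first dimension fastest, matching as.vector(data)
  indices <- expand.grid(dn, stringsAsFactors = FALSE, KEEP.OUT.ATTRS = FALSE)
  indices$value <- as.vector(data)
  indices
}

res <- melt_chr(test)
head(res, 3)
str(res$Var1)   # character, IDs intact
```

Because the dimnames are unnamed here, expand.grid labels the columns Var1, Var2 and Var3. The result is a plain data.frame, which can be converted to a data.table afterwards if needed.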