Convert numeric representation of 'variable' column to original string following melt using patterns

Question

I am using the patterns() argument in data.table::melt() to melt data whose columns follow several easily defined patterns. It works, but I don't see how to get a character index variable instead of the default numeric one.

For example, in A the dog and cat columns are numbered. Take a look at the "variable" column:

library(data.table)

A = data.table(idcol = 1:5,
               dog_1 = 1:5,   cat_1 = 101:105,
               dog_2 = 6:10,  cat_2 = 106:110,
               dog_3 = 11:15, cat_3 = 111:115)
head(melt(A, measure = patterns("^dog", "^cat"), value.name = c("dog", "cat")))

   idcol variable dog cat
1:     1        1   1 101
2:     2        1   2 102
3:     3        1   3 103
4:     4        1   4 104
5:     5        1   5 105
6:     1        2   6 106

In B, however, the dog and cat columns carry text suffixes, yet the "variable" column is still numeric:

B = data.table(idcol = 1:5,
               dog_one = 1:5,     cat_one = 101:105,
               dog_two = 6:10,    cat_two = 106:110,
               dog_three = 11:15, cat_three = 111:115)
head(melt(B, measure = patterns("^dog", "^cat"), value.name = c("dog", "cat")))

   idcol variable dog cat
1:     1        1   1 101
2:     2        1   2 102
3:     3        1   3 103
4:     4        1   4 104
5:     5        1   5 105
6:     1        2   6 106

Recommended answer

There might be easier ways, but this seems to work:

# grab suffixes of 'variable' names
suff <- unique(sub('^.*_', '', names(B[ , -1])))
# suff <- unique(tstrsplit(names(B[, -1]), "_")[[2]])

# melt
B2 <- melt(B, measure = patterns("^dog", "^cat"), value.name = c("dog", "cat"))

# replace factor levels in 'variable' with the suffixes
setattr(B2$variable, "levels", suff)

B2
#     idcol variable dog cat
# 1:      1      one   1 101
# 2:      2      one   2 102
# 3:      3      one   3 103
# 4:      4      one   4 104
# 5:      5      one   5 105
# 6:      1      two   6 106
# 7:      2      two   7 107
# 8:      3      two   8 108
# 9:      4      two   9 109
# 10:     5      two  10 110
# 11:     1    three  11 111
# 12:     2    three  12 112
# 13:     3    three  13 113
# 14:     4    three  14 114
# 15:     5    three  15 115
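If a plain character column is wanted rather than a relabelled factor, one possible variation is to index the suffix vector with the group numbers that melt() produces. This is only a sketch: like the approach above, it assumes the dog and cat columns appear in the same suffix order in B, and the name B_chr is made up for illustration.

```r
library(data.table)

B <- data.table(idcol = 1:5,
                dog_one = 1:5,     cat_one = 101:105,
                dog_two = 6:10,    cat_two = 106:110,
                dog_three = 11:15, cat_three = 111:115)

# suffixes in order of first appearance among the measure columns
suff <- unique(sub("^.*_", "", names(B)[-1]))

B_chr <- melt(B, measure.vars = patterns("^dog", "^cat"),
              value.name = c("dog", "cat"))

# 'variable' holds the group number (a factor by default);
# as.integer() turns it into a position that indexes into suff
B_chr[, variable := suff[as.integer(variable)]]
```

Unlike setattr(), this replaces the column instead of editing the factor levels in place, so the result is a character vector.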

Note that there is an open issue on this topic with some other alternatives: FR: expansion of melt functionality for handling names of output.
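For what it's worth, more recent data.table releases include a measure() helper that splits the column names itself and keeps the suffix as a column. I believe it arrived in 1.14.2, but treat the version as an assumption and check your installation. A minimal sketch, where the column name suffix and the object name B3 are my own choices:

```r
library(data.table)

B <- data.table(idcol = 1:5,
                dog_one = 1:5,     cat_one = 101:105,
                dog_two = 6:10,    cat_two = 106:110,
                dog_three = 11:15, cat_three = 111:115)

# value.name is a keyword: the part before "_" names the output value
# columns (dog, cat); the part after "_" is stored in a column 'suffix'.
# idcol has no "_" and is therefore left as an id column.
B3 <- melt(B, measure.vars = measure(value.name, suffix, sep = "_"))
```

With this form there is no need to patch the factor levels afterwards, since the suffix column is built from the original names.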

This is one of the (rare) instances where I believe good ol' base::reshape is cleaner. Its sep argument comes in handy here: both the names of the 'value' columns and the levels of the 'variable' column are generated in one go:

reshape(data = B,
        varying = names(B[ , -1]),
        sep = "_",
        direction = "long")
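If the suffix column should be called variable and idcol should serve as the id (by default reshape names the column time and adds an id column of its own), the timevar and idvar arguments can be set as well. A small sketch on a plain data.frame copy of B; the names B_df and long are only for illustration:

```r
B_df <- data.frame(idcol = 1:5,
                   dog_one = 1:5,     cat_one = 101:105,
                   dog_two = 6:10,    cat_two = 106:110,
                   dog_three = 11:15, cat_three = 111:115)

long <- reshape(data = B_df,
                varying = names(B_df)[-1],
                sep = "_",
                timevar = "variable",  # column that receives the suffixes
                idvar = "idcol",       # reuse the existing id column
                direction = "long")
```

The variable column comes out as character here, because the suffixes cannot be converted to numbers.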
