Dictionary style replace multiple items


Question



I have a large data.frame of character data that I want to convert based on what is commonly called a dictionary in other languages.

Currently I am going about it like so:

foo <- data.frame(snp1 = c("AA", "AG", "AA", "AA"), snp2 = c("AA", "AT", "AG", "AA"), snp3 = c(NA, "GG", "GG", "GC"), stringsAsFactors=FALSE)
foo <- replace(foo, foo == "AA", "0101")
foo <- replace(foo, foo == "AC", "0102")
foo <- replace(foo, foo == "AG", "0103")

This works fine, but it is obviously not pretty and seems silly to repeat the replace statement each time I want to replace one item in the data.frame.

Is there a better way to do this since I have a dictionary of approximately 25 key/value pairs?

Solution

map = setNames(c("0101", "0102", "0103"), c("AA", "AC", "AG"))
foo[] <- map[unlist(foo)]

This assumes that map covers all the cases in foo; any value without an entry in map comes back as NA. It would feel less like a 'hack', and be more efficient in both space and time, if foo were a matrix (of character()), in which case:

matrix(map[foo], nrow=nrow(foo), dimnames=dimnames(foo))
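
As a rough illustration of the covering assumption, the sketch below runs the matrix variant on the sample data from the question, where "AT", "GG" and "GC" have no entry in the three-item map. The names m, recoded, kept and hit are made up for this example, and the fallback that keeps unmapped values is an addition, not part of the original answer.

```r
# Sample data and the three-entry map from the question and answer
foo <- data.frame(snp1 = c("AA", "AG", "AA", "AA"),
                  snp2 = c("AA", "AT", "AG", "AA"),
                  snp3 = c(NA, "GG", "GG", "GC"),
                  stringsAsFactors = FALSE)
map <- setNames(c("0101", "0102", "0103"), c("AA", "AC", "AG"))

# Matrix variant: cells with no entry in map (and NA cells) come back as NA
m <- as.matrix(foo)
recoded <- matrix(map[m], nrow = nrow(m), dimnames = dimnames(m))

# Hypothetical fallback: overwrite only the cells whose value is a key of map
kept <- m
hit <- m %in% names(map)      # logical vector, one entry per cell
kept[hit] <- map[m[hit]]      # unmapped cells and NA keep their original value
```

If every value in the real data has an entry in the roughly 25-pair dictionary, recoded and kept should be identical and the fallback is unnecessary.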

Both matrix and data frame variants run afoul of R's 2^31-1 limit on vector size when there are millions of SNPs and thousands of samples.
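
To put a number on that limit, and as one possible way to avoid building a single vector with one element per cell, here is a minimal sketch. The sizes n_snps and n_samples are illustrative assumptions, and the column-by-column lookup is a swapped-in variant rather than the method from the answer above; whether it helps in practice depends on how the data is laid out and whether it fits in memory at all.

```r
# Sample data and map as in the question and answer
foo <- data.frame(snp1 = c("AA", "AG", "AA", "AA"),
                  snp2 = c("AA", "AT", "AG", "AA"),
                  snp3 = c(NA, "GG", "GG", "GC"),
                  stringsAsFactors = FALSE)
map <- setNames(c("0101", "0102", "0103"), c("AA", "AC", "AG"))

# Illustrative sizes only (assumed, not taken from the original post)
n_snps <- 2e6
n_samples <- 5e3
n_cells <- n_snps * n_samples                 # computed as a double
too_big <- n_cells > .Machine$integer.max     # compare against 2^31 - 1

# Column-by-column lookup: each lookup works on a vector of nrow(foo) elements
# instead of the nrow(foo) * ncol(foo) elements produced by unlist(foo)
foo[] <- lapply(foo, function(column) unname(map[column]))
```

As with the other variants, values that have no entry in map become NA.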
