Dictionary style replace multiple items
Question
I have a large data.frame of character data that I want to convert based on what is commonly called a dictionary in other languages.
Currently I am going about it like so:
    foo <- data.frame(snp1 = c("AA", "AG", "AA", "AA"),
                      snp2 = c("AA", "AT", "AG", "AA"),
                      snp3 = c(NA, "GG", "GG", "GC"),
                      stringsAsFactors = FALSE)
    foo <- replace(foo, foo == "AA", "0101")
    foo <- replace(foo, foo == "AC", "0102")
    foo <- replace(foo, foo == "AG", "0103")
This works fine, but it is obviously not pretty, and it seems silly to repeat the replace statement each time I want to replace one item in the data.frame.
Is there a better way to do this, since I have a dictionary of approximately 25 key/value pairs?
Solution
    map = setNames(c("0101", "0102", "0103"), c("AA", "AC", "AG"))
    foo[] <- map[unlist(foo)]
assuming that map covers all the cases in foo. This would feel less like a 'hack', and be more efficient in both space and time, if foo were a matrix (of character()); then
    matrix(map[foo], nrow=nrow(foo), dimnames=dimnames(foo))
Both matrix and data frame variants run afoul of R's 2^31-1 limit on vector size when there are millions of SNPs and thousands of samples.
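
The answer assumes that map covers every value in foo. In the example data it does not ("AT", "GG", "GC" and NA have no entry), and indexing a named vector by a missing name yields NA, so those cells would be overwritten with NA. The following is a minimal sketch of one way to leave unmapped values untouched. The helper name recode_with_map is hypothetical and not part of the original answer.

```r
# Example data from the question.
foo <- data.frame(
  snp1 = c("AA", "AG", "AA", "AA"),
  snp2 = c("AA", "AT", "AG", "AA"),
  snp3 = c(NA, "GG", "GG", "GC"),
  stringsAsFactors = FALSE
)

# Named character vector acting as the dictionary: names are the keys.
map <- setNames(c("0101", "0102", "0103"), c("AA", "AC", "AG"))

# Hypothetical helper (an assumption, not from the answer): translate the
# values that appear in names(map) and keep everything else, including NA.
recode_with_map <- function(df, map) {
  old <- unlist(df, use.names = FALSE)   # all cells, column by column
  hit <- match(old, names(map))          # NA where the key is not in map
  new <- ifelse(is.na(hit), old, unname(map)[hit])
  df[] <- new                            # refill, keeping shape and names
  df
}

bar <- recode_with_map(foo, map)
print(bar)
```

Using match() once over the flattened data keeps the single-lookup character of the answer's approach. The same size limit discussed above still applies, because unlist() builds one long vector.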