Splitting characters in columns and naming the new columns
Problem description
I want to split characters. I have a large data frame to work with, but the following small example shows what needs to be done.
mydf <- data.frame (name = c("L1", "L2", "L3"),
M1 = c("AC", "AT", NA), M2 = c("CC", "--", "TC"), M3 = c("AT", "TT", "AG"))
I want to split the characters for variables M1 to M3 (in the real dataset I have > 6000 variables), so that the result looks like this:
name M1a M1b M2a M2b M3a M3b
L1 A C C C A T
L2 A T - - T T
L3 NA NA T C A G
I tried the following code:
func<- function(x) {sapply( strsplit(x, ""),
match, table= c("A","C","T","G", "--", NA))}
odataframe <- data.frame(apply(mydf, 1, func) )
colnames(odataframe) <- paste(rep(names(mydf), each = 2), c("a", "b"), sep = "")
odataframe
Solution
Here you go:
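# Split a column of two-character strings into a two-column character matrix
splitCol <- function(x){
  x <- as.character(x)          # convert factors to character
  x[is.na(x)] <- "$$"           # placeholder so that NA still yields two pieces
  z <- matrix(unlist(strsplit(x, split="")), ncol=2, byrow=TRUE)
  z[z=="$"] <- NA               # turn the placeholder back into NA
  z
}
# Apply to every column except name, then bind the pieces side by side
newdf <- as.data.frame(do.call(cbind, lapply(mydf[, -1], splitCol)))
names(newdf) <- paste(rep(names(mydf[, -1]), each=2), c("a", "b"), sep="")
newdf <- data.frame(mydf[, 1, drop=FALSE], newdf)
newdf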
name M1a M1b M2a M2b M3a M3b
1 L1 A C C C A T
2 L2 A T - - T T
3 L3 <NA> <NA> T C A G
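As a possible variation on the answer above, the split can also be done by position with substr(), which returns NA for NA input and so does not need the "$$" placeholder. This is only a sketch under the assumption that every non-missing value is exactly two characters long, as in the example. The helper name splitByPos and the result name newdf2 are made up for illustration and are not part of the original answer.

```r
# Rebuild the example data from the question
mydf <- data.frame(name = c("L1", "L2", "L3"),
                   M1 = c("AC", "AT", NA),
                   M2 = c("CC", "--", "TC"),
                   M3 = c("AT", "TT", "AG"),
                   stringsAsFactors = FALSE)

# Hypothetical helper: split a two-character column by position.
# Assumes each non-missing value has exactly two characters.
splitByPos <- function(x) {
  x <- as.character(x)
  cbind(substr(x, 1, 1), substr(x, 2, 2))  # substr() keeps NA as NA
}

# Split every column except name and bind the pieces side by side
parts <- do.call(cbind, lapply(mydf[, -1], splitByPos))
colnames(parts) <- paste(rep(names(mydf)[-1], each = 2), c("a", "b"), sep = "")

# Put the name column back in front
newdf2 <- data.frame(mydf[, 1, drop = FALSE], parts, stringsAsFactors = FALSE)
newdf2
```

With > 6000 variables either approach loops over the columns once; which one is faster on real data would need to be measured.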