Replacing rownames of data frame by a sub-string
Problem description
I have a large dataframe (named test) with different rownames.
> rownames(test)
[1] "U2OS.EV.2.7.9" "U2OS.PIM.2.7.9" "U2OS.WDR.2.7.9" "U2OS.MYC.2.7.9"
[5] "U2OS.OBX.2.7.9" "U2OS.EV.18.6.9" "U2O2.PIM.18.6.9" "U2OS.WDR.18.6.9"
[9] "U2OS.MYC.18.6.9" "U2OS.OBX.18.6.9" "X1.U2OS...OBX" "X2.U2OS...MYC"
[13] "X3.U2OS...WDR82" "X4.U2OS...PIM" "X5.U2OS...EV" "exp1.U2OS.EV"
[17] "exp1.U2OS.MYC" "EXP1.U20S..PIM1" "EXP1.U2OS.WDR82" "EXP1.U20S.OBX"
[21] "EXP2.U2OS.EV" "EXP2.U2OS.MYC" "EXP2.U2OS.PIM1" "EXP2.U2OS.WDR82"
[25] "EXP2.U2OS.OBX"
As you can see, some of the row names share a partial name, for example every row whose name contains MYC. I want to change the whole row name of each such row to "MYC". Overall, the row names contain 5 factors: MYC, EV, PIM, WDR and OBX.
Solution
As @teucer points out, you can't have duplicate row names. Instead, create a new column in your data frame and use a simple regular expression to extract your factors. For example,
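## Your row names (only the first 12, for illustration)
x = c("U2OS.EV.2.7.9", "U2OS.PIM.2.7.9", "U2OS.WDR.2.7.9", "U2OS.MYC.2.7.9",
"U2OS.OBX.2.7.9", "U2OS.EV.18.6.9", "U2O2.PIM.18.6.9","U2OS.WDR.18.6.9",
"U2OS.MYC.18.6.9","U2OS.OBX.18.6.9", "X1.U2OS...OBX","X2.U2OS...MYC")
## Keep only the factor matched inside each name and drop everything around it.
## On the real data frame, pass rownames(test) instead of x so the lengths match.
test$rnames = gsub(".*(MYC|EV|PIM|WDR|OBX).*", "\\1", x)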
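To try this end to end, here is a minimal sketch that builds a stand-in for test from the 25 row names in the question. The value column and the dup_ok variable are made up for illustration; only the row names and the regular expression come from the post.

```r
# Hypothetical stand-in for `test`: the 25 row names from the question
# plus a placeholder `value` column (an assumption, not the real data).
rn <- c("U2OS.EV.2.7.9", "U2OS.PIM.2.7.9", "U2OS.WDR.2.7.9", "U2OS.MYC.2.7.9",
        "U2OS.OBX.2.7.9", "U2OS.EV.18.6.9", "U2O2.PIM.18.6.9", "U2OS.WDR.18.6.9",
        "U2OS.MYC.18.6.9", "U2OS.OBX.18.6.9", "X1.U2OS...OBX", "X2.U2OS...MYC",
        "X3.U2OS...WDR82", "X4.U2OS...PIM", "X5.U2OS...EV", "exp1.U2OS.EV",
        "exp1.U2OS.MYC", "EXP1.U20S..PIM1", "EXP1.U2OS.WDR82", "EXP1.U20S.OBX",
        "EXP2.U2OS.EV", "EXP2.U2OS.MYC", "EXP2.U2OS.PIM1", "EXP2.U2OS.WDR82",
        "EXP2.U2OS.OBX")
test <- data.frame(value = seq_along(rn), row.names = rn)

# Extract the factor name from every row name into a new factor column.
test$rnames <- factor(gsub(".*(MYC|EV|PIM|WDR|OBX).*", "\\1", rownames(test)))

# Count how many rows fall under each factor.
print(table(test$rnames))

# Writing the extracted names back as row names is rejected by R,
# because data frame row names must be unique.
dup_ok <- tryCatch({
  suppressWarnings(rownames(test) <- as.character(test$rnames))
  TRUE
}, error = function(e) FALSE)
```

With the factor in its own column, grouping functions such as split() or aggregate() can use test$rnames directly, which is usually what renaming the rows was meant to achieve.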