How to populate a data frame based on a condition in R


Question

I created an empty data frame like this:

                 id Alyr Crub Lala Brap Bole Spar Esal Aara Thas
1 XLOC_003940_TBH_1   NA   NA   NA   NA   NA   NA   NA   NA   NA

I want to check whether the id and a column name match and, if they do, replace the NA with a certain value. Here is an example:

ex1 <- "Alyr_XLOC_003940_TBH_1_Ortholog_Known_Gene_Sense"

sp <- sub("([A-Za-z]+)_(XLOC_\\d+_TBH_1)_([A-Za-z_]+)","\\1", ex1)
gene <- sub("([A-Za-z]+)_(XLOC_\\d+_TBH_1)_([A-Za-z_]+)","\\2", ex1)
fun <- sub("([A-Za-z]+)_(XLOC_\\d+_TBH_1)_([A-Za-z_]+)","\\3", ex1)

Based on the above example, I want to get something like this:

                 id                      Alyr Crub Lala Brap Bole Spar Esal Aara Thas
1 XLOC_003940_TBH_1 Ortholog_Known_Gene_Sense   NF   NF   NF   NF   NF   NF   NF   NF

I am stuck here and can't figure out how to do this.

Recommended answer

Use matrix subsetting:

df1$id <- gene
df1[cbind(1:nrow(df1), match(sp, names(df1)))] <- fun

See this answer for more on subsetting a data frame by a two-column matrix.
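To make the two-column index concrete, here is a minimal sketch on a toy data frame. The names `toy` and `idx` are illustrative and do not come from the original answer. Each row of the index matrix is read as one (row, column) position, so a single assignment can write to a different column in every row.

```r
# Toy data frame with all-NA columns; names are illustrative only
toy <- data.frame(a = c(NA, NA), b = c(NA, NA), c = c(NA, NA))

# Each row of idx is one (row, column) pair:
# row 1 -> column "c", row 2 -> column "a"
idx <- cbind(c(1, 2), match(c("c", "a"), names(toy)))

# Writes "x" to toy[1, "c"] and "y" to toy[2, "a"]
toy[idx] <- c("x", "y")
toy
```

Columns that receive a string become character columns, while untouched columns keep their original logical type.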

##Example
nms <- scan(what="character", text="id Alyr Crub Lala Brap Bole Spar Esal Aara Thas")
df1 <- as.data.frame(matrix(NA, 3, 10))
names(df1) <- nms
df1
#  id Alyr Crub Lala Brap Bole Spar Esal Aara Thas
#1 NA   NA   NA   NA   NA   NA   NA   NA   NA   NA
#2 NA   NA   NA   NA   NA   NA   NA   NA   NA   NA
#3 NA   NA   NA   NA   NA   NA   NA   NA   NA   NA


ex1 <- c("Alyr_XLOC_003940_TBH_1_Ortholog_Gene",
         "Lala_XLOC_1234_TBH_1_Lalala_Gene",
         "Thas_XLOC_5678_TBH_1_Thasthas_Gene")

sp <- sub("([A-Za-z]+)_(XLOC_\\d+_TBH_1)_([A-Za-z_]+)","\\1", ex1)
gene <- sub("([A-Za-z]+)_(XLOC_\\d+_TBH_1)_([A-Za-z_]+)","\\2", ex1)
fun <- sub("([A-Za-z]+)_(XLOC_\\d+_TBH_1)_([A-Za-z_]+)","\\3", ex1)

df1$id <- gene
df1[cbind(1:nrow(df1), match(sp, names(df1)))] <- fun
df1
  #                  id          Alyr Crub        Lala Brap Bole Spar Esal Aara          Thas
  # 1 XLOC_003940_TBH_1 Ortholog_Gene   NA        <NA>   NA   NA   NA   NA   NA          <NA>
  # 2   XLOC_1234_TBH_1          <NA>   NA Lalala_Gene   NA   NA   NA   NA   NA          <NA>
  # 3   XLOC_5678_TBH_1          <NA>   NA        <NA>   NA   NA   NA   NA   NA Thasthas_Gene
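The question asked for unmatched cells to read NF rather than NA. One possible way to get there, offered as a sketch and not as part of the original answer, is to index the data frame with the logical matrix returned by `is.na()`. The data frame `df_demo` below is a hypothetical stand-in for the filled result, trimmed to a few columns.

```r
# Hypothetical stand-in for the filled data frame, trimmed to four columns
df_demo <- data.frame(
  id   = c("XLOC_003940_TBH_1", "XLOC_1234_TBH_1"),
  Alyr = c("Ortholog_Gene", NA),
  Crub = c(NA, NA),
  Lala = c(NA, "Lalala_Gene"),
  stringsAsFactors = FALSE
)

# is.na() on a data frame returns a logical matrix of the same shape;
# using it as an index replaces every remaining NA in one step
df_demo[is.na(df_demo)] <- "NF"
df_demo
```

Note that this turns every column that contained an NA into a character column, which is fine here because the cells hold labels and not numbers.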
