Searching for a list of strings in a data frame in R


Problem description

I have a list of names and a data.frame with many different columns. How can I retrieve the rows of the data frame whose row name is one of the names in my list? For example, the row names in my data frame include TC09001536.hg.1, TC03002852.hg.1, and TC18000664.hg.1, among many others, and those names are saved in a list called Top.list. Assuming my data frame is called df, I tried:

test <- df[grep(Top.list, df$cluster_id),]

to look within the cluster_id column and, if a value matches one of the names in my list, return the whole row.

Thanks.

Solution

This should work:

test <- df[unlist(lapply(Top.list, function(x) grep(x, df$cluster_id, fixed = TRUE))),]

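The lapply(Top.list, function(x) grep(x, df$cluster_id, fixed = TRUE)) part generates a list that holds, for each name, a vector of matching row numbers. unlist then combines those vectors into a single vector, which is used to subset your data frame. With fixed = TRUE, each name is treated as a literal string rather than a regular expression, so the dots in the names match only literal dots.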
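
To make the steps above concrete, here is a minimal sketch on made-up data. The contents of df (including the score column) are hypothetical; only the names df, cluster_id, and Top.list come from the question. The %in% line is not part of the answer above. It is an alternative that may fit better when the names should match whole strings, since grep with fixed = TRUE also matches substrings.

```r
# Hypothetical sample data; the score column and the extra IDs are invented.
df <- data.frame(
  cluster_id = c("TC09001536.hg.1", "TC01000001.hg.1", "TC03002852.hg.1",
                 "TC18000664.hg.1", "TC02000002.hg.1"),
  score = c(1.2, 0.4, 3.1, 2.7, 0.9),
  stringsAsFactors = FALSE
)
Top.list <- list("TC09001536.hg.1", "TC03002852.hg.1", "TC18000664.hg.1")

# Approach from the answer: one literal (fixed) grep per name, then flatten
# the per-name row numbers into a single index vector.
idx <- unlist(lapply(Top.list, function(x) grep(x, df$cluster_id, fixed = TRUE)))
test <- df[idx, ]

# Alternative (an assumption, not the answer's method): exact whole-string
# matching with %in%, which keeps the rows in the data frame's own order.
test_exact <- df[df$cluster_id %in% unlist(Top.list), ]

print(test)
print(test_exact)
```

Note that the grep approach returns rows in the order of Top.list and can repeat a row if it matches more than one name, in which case wrapping idx in unique() removes the repeats.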
