Subset columns based on column names


Question

I have a data frame df1 with IDs:

df1 <- read.table(text = "ID
8765
1879
8706
1872
0178
0268
0270
0269
0061
0271", header = TRUE)

And a second data frame, df2, with these column names:

> names(df2)
 [1] "TW_3784.IT"   "TW_3970.IT"   "TW_1879.IT"   "TW_0178.IT"   "SF_0271.IT" "TW_3782.IT"  
 [7] "TW_3783.IT"   "TW_8765.IT"   "TW_8706.IT"   "SF_0268.IT" "SF_0270.IT" "SF_0269.IT"
[13] "SF_0061.IT"

What I need is to keep only the columns of df2 whose names partially match an ID in df1. I tried dplyr:

df3 = df2 %>% 
  dplyr::select(df2 , dplyr::contains(df1$ID))
which fails with:

Error in dplyr::contains(df1$ID) : is_string(match) is not TRUE

And using grepl:

df3 = df2[,grepl(df1$ID, names(df2))]

which gives the warning:
In grepl(df1$ID, names(df2)) :
  argument 'pattern' has length > 1 and only the first element will be used

Recommended answer

Here is a solution that uses the dplyr package.

library(dplyr)  # provides select(), matches() and %>%

df2 %>% select(matches(paste(df1$ID, collapse = "|")))

paste() joins the IDs from df1 into a single string with | as the separator (| means logical OR in a regular expression), like this:

"8765|1879|8706|1872|178|268|270|269|61|271"

This is needed because matches() takes a single regular expression and looks for column names that match one or another of these numbers; select() then keeps those columns. The leading zeros are missing (178 rather than 0178) because read.table parsed ID as integers, but the pattern still works here because the match is partial. dplyr provides select(), matches() and (re-exported from magrittr) %>%.
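
For comparison, the same idea also repairs the grepl attempt from the question in base R. The following is only a sketch under a few assumptions that are not in the original post: the IDs are read as character (via colClasses) so the leading zeros survive, df2 is a hypothetical one-row stand-in that carries the column names listed in the question, and the pattern is anchored between "_" and "." so that a short ID cannot accidentally match inside a longer number.

```r
# Read the IDs as character so leading zeros (e.g. "0178") are preserved
df1 <- read.table(text = "ID
8765
1879
8706
1872
0178
0268
0270
0269
0061
0271", header = TRUE, colClasses = "character")

# Hypothetical stand-in for df2: dummy values, column names taken from the question
cols <- c("TW_3784.IT", "TW_3970.IT", "TW_1879.IT", "TW_0178.IT", "SF_0271.IT",
          "TW_3782.IT", "TW_3783.IT", "TW_8765.IT", "TW_8706.IT", "SF_0268.IT",
          "SF_0270.IT", "SF_0269.IT", "SF_0061.IT")
df2 <- as.data.frame(matrix(seq_along(cols), nrow = 1,
                            dimnames = list(NULL, cols)))

# Collapse all IDs into one alternation, anchored between "_" and "."
pattern <- paste0("_(", paste(df1$ID, collapse = "|"), ")\\.")

# grepl() now receives a single pattern, so the length > 1 warning goes away
keep <- grepl(pattern, names(df2))

# drop = FALSE keeps a data frame even if only one column matches
df3 <- df2[, keep, drop = FALSE]
names(df3)
```

With dplyr loaded, the same pattern string could be passed as select(df2, matches(pattern)). Whether the anchoring is appropriate depends on the real column names always following the PREFIX_ID.IT layout shown in the question.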
