How to filter a table's rows based on an external vector?

Problem description

(1) I have a large table read into R with more than 10,000 rows and 10 columns.

(2) The 3rd column of the table contains the hospital names. Some of them appear twice or more.

(3) I have a vector of hospital names, e.g. 10 of them that need to be studied further.

(4) Could you show me how to extract all the rows of the table in step 1 whose hospital name is listed in the vector from step 3?

Here is a shortened example of my input file:

Patients Treatment Hospital Response 
1        A         YYY      Good 
2        B         YYY      Dead 
3        A         ZZZ      Good 
4        A         WWW      Good 
5        C         UUU      Dead

I have a vector of hospitals that I am interested in studying further, i.e. YYY and UUU. How can I generate the following output table with R?

Patients Treatment Hospital Response 
1        A         YYY      Good 
2        B         YYY      Dead 
5        C         UUU      Dead

Recommended answer

Use the %in% operator.
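As a quick illustration of what %in% returns, here is a minimal sketch on a bare character vector. The names `x`, `wanted` and `keep` are made up for this example and are not part of the original answer.

```r
# Hypothetical values, chosen only to show the behaviour of %in%
x <- c("yyy", "yyy", "zzz", "www", "uuu")
wanted <- c("yyy", "uuu")

# %in% tests each element of the left-hand side for membership in the
# right-hand side and returns one TRUE/FALSE per element of x
keep <- x %in% wanted
print(keep)         # TRUE TRUE FALSE FALSE TRUE

# which() converts the logical mask into positions
print(which(keep))  # 1 2 5
```

The result always has the same length as the left-hand vector, which is what makes it usable as a row index for a data.frame.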

# Sample data (lowercase stand-ins for the hospital names in the question;
# treatment and response are filler values)
dat <- data.frame(patients = 1:5, treatment = letters[1:5],
  hospital = c("yyy", "yyy", "zzz", "www", "uuu"), response = rnorm(5))

# Vector of hospitals we want to analyse further
goodHosp <- c("yyy", "uuu")

You can either index directly into your data.frame object:

dat[dat$hospital %in% goodHosp, ]

or use the subset() function:

subset(dat, hospital %in% goodHosp)
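For reference, the same approach can be applied to the example table from the question. This is a sketch under a few assumptions: the table is read from an inline string with read.table() rather than from the real input file, and the names `input`, `wanted`, `by_index` and `by_subset` are invented for the illustration.

```r
# Re-create the question's example table from inline text
# (the real data would come from the asker's own file instead)
input <- read.table(text = "
Patients Treatment Hospital Response
1 A YYY Good
2 B YYY Dead
3 A ZZZ Good
4 A WWW Good
5 C UUU Dead
", header = TRUE, stringsAsFactors = FALSE)

# External vector of hospitals of interest
wanted <- c("YYY", "UUU")

# Option 1: logical row index
by_index <- input[input$Hospital %in% wanted, ]

# Option 2: subset(), which evaluates the condition inside the data.frame
by_subset <- subset(input, Hospital %in% wanted)

print(by_index)
```

Both options keep the rows for patients 1, 2 and 5. Note that %in% compares strings exactly and is case-sensitive, so a vector containing "yyy" would not match a column containing "YYY".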
