How to filter a table's rows based on an external vector?
Problem description
(1) I have a large table read into R with more than 10,000 rows and 10 columns.
(2) The 3rd column of the table contains hospital names. Some of them appear twice or more.
(3) I have a vector of hospital names, e.g. 10 of them, that need to be studied further.
(4) Could you show me how to extract all the rows of the table in step 1 whose hospital names are listed in step 3?
Here is a shorter example of my input file:
Patients Treatment Hospital Response
1 A YYY Good
2 B YYY Dead
3 A ZZZ Good
4 A WWW Good
5 C UUU Dead
I have a vector of hospitals that I am interested in studying further, i.e. YYY and UUU. How can I generate the following output table with R?
Patients Treatment Hospital Response
1 A YYY Good
2 B YYY Dead
5 C UUU Dead
Recommended answer
Use the %in% operator.
# Sample data
dat <- data.frame(patients = 1:5, treatment = letters[1:5],
                  hospital = c("yyy", "yyy", "zzz", "www", "uuu"), response = rnorm(5))
# Vector of hospitals we want to do further analysis on
goodHosp <- c("yyy", "uuu")
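To see why %in% works as a row filter, it helps to look at what it returns on its own: a logical vector with one entry per element of its left-hand side. The following is a minimal sketch; the names `hospital`, `wanted` and `keep` are illustrative and not part of the original answer.

```r
# Illustrative vectors (names are assumptions made for this sketch)
hospital <- c("yyy", "yyy", "zzz", "www", "uuu")
wanted <- c("yyy", "uuu")

# TRUE wherever the hospital name appears anywhere in `wanted`
keep <- hospital %in% wanted
print(keep)
# [1]  TRUE  TRUE FALSE FALSE  TRUE

# which() turns the logical mask into row positions
print(which(keep))
# [1] 1 2 5
```

Using this logical vector as the row index of a data.frame keeps exactly the rows marked TRUE.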
You can either index directly into your data.frame object:
dat[dat$hospital %in% goodHosp, ]
or use the subset() function:
subset(dat, hospital %in% goodHosp)
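Applied to the example table from the question, the same approach might look like the sketch below. The names `input`, `hospitals_of_interest` and `result` are assumptions made for illustration, and the table is read from an inline string here only so that the snippet is self-contained; with a real file you would pass its path to read.table() instead.

```r
# Example table from the question, read from an inline string.
# The object names used here are illustrative, not from the original post.
input <- read.table(text = "Patients Treatment Hospital Response
1 A YYY Good
2 B YYY Dead
3 A ZZZ Good
4 A WWW Good
5 C UUU Dead", header = TRUE, stringsAsFactors = FALSE)

# External vector of hospitals to keep
hospitals_of_interest <- c("YYY", "UUU")

# Keep only the rows whose Hospital value occurs in the vector
result <- input[input$Hospital %in% hospitals_of_interest, ]
print(result)
```

The result keeps the original row names (1, 2 and 5), which makes it easy to trace rows back to the full table. Note that %in% compares strings exactly, so "yyy" would not match "YYY".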