How to extract a subset of a data frame based on a condition involving a field?

Problem description

I have a large CSV with the results of a medical survey from different locations (the location is a factor present in the data). As some analyses are specific to a location and for convenience, I'd like to extract subframes with the rows only from those locations. It happens that the location is the very first field so yes, I could do it by sorting the CSV rows, but I'd like to learn how to do it in R as I'm sure I'll need this for other columns.

So, in a nutshell, the question is: given a data frame foo, how can I create another data frame bar which only contains the rows from foo where foo$location = 'there'?

Thanks a lot.

Solution

Here are the two main approaches. I prefer this one for its readability:

bar <- subset(foo, location == "there")

Note that you can string together many conditionals with & and | to create complex subsets.
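As a minimal sketch of the subset() approach with combined conditions: the data below is made up for illustration, and the columns age and score are assumptions that do not come from the question (only location does).

```r
# Hypothetical survey data; only `location` comes from the question,
# `age` and `score` are invented for this example.
foo <- data.frame(
  location = c("there", "here", "there", "elsewhere"),
  age      = c(34, 51, 67, 45),
  score    = c(7, 3, 9, 5),
  stringsAsFactors = TRUE   # location is a factor, as in the question
)

# Single condition: rows where location is "there".
bar <- subset(foo, location == "there")

# Combined with & (and): location is "there" AND age is above 40.
older_there <- subset(foo, location == "there" & age > 40)

# Combined with | (or): location is "here" OR score is at least 9.
here_or_high <- subset(foo, location == "here" | score >= 9)

# Optional: drop factor levels that no longer occur in the subset.
bar <- droplevels(bar)
```

Without the droplevels() call, bar$location would still carry the levels "here" and "elsewhere", which can show up as empty groups in later tables or plots.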

The second is the indexing approach. You can index rows in R with either a numeric or a logical vector. foo$location == "there" returns a logical vector of TRUE and FALSE values with one element per row of foo. Use it as the row index to return only the rows where the condition is TRUE.

foo[foo$location == "there", ]
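The indexing approach can be sketched the same way. The data here is again hypothetical (score is an invented column), and it includes a missing location on purpose, because the two approaches treat NA differently: subset() drops rows where the condition is NA, while a logical index keeps them as rows filled with NA. Note also that the trailing comma matters: foo[i, ] selects rows, whereas foo[i] selects columns.

```r
# Hypothetical data with one missing location.
foo <- data.frame(
  location = c("there", "here", NA, "there"),
  score    = c(7, 3, 9, 5)
)

# Logical vector, one element per row; the comparison with NA yields NA.
keep <- foo$location == "there"

# Logical indexing: the NA element produces an extra row of NAs.
by_logical <- foo[keep, ]

# which() converts to row numbers and discards NA, so only real matches remain.
by_which <- foo[which(keep), ]

# subset() also discards rows where the condition is NA.
by_subset <- subset(foo, location == "there")

# Numeric indexing: select rows 1 and 4 directly by position.
by_number <- foo[c(1, 4), ]
```

If the column has no missing values, all four forms return the same rows.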
