Subset dataframe such that all values in each row are less than a certain value


Question

I have a data frame with a dimension column and 4 value columns. How can I subset the rows so that all 4 value columns in each record are less than a given x? I know I could do this manually using subset() and specifying the condition for each column, but is there a way to do it with something like an apply function? Below is a sample data frame. For example, let's say x is 0.7. In that case I want to eliminate any row in which any column is greater than 0.7.

   zips ABC DEF GHI JKL
1     1 0.8 0.6 0.1 0.6
2     2 0.1 0.3 0.8 1.0
3     3 0.5 0.1 0.4 0.8
4     4 0.6 0.4 0.2 0.3
5     5 1.0 0.8 0.6 0.5
6     6 0.2 0.7 0.3 0.4
7     7 0.3 1.0 1.0 0.2
8     8 0.7 0.9 0.5 0.1
9     9 0.9 0.5 0.9 0.7
10   10 0.4 0.2 0.7 0.9

The following expression seemed to work, but could someone explain the logic here?

Variance_Percentile[!rowSums(Variance_Percentile[-1] > 0.7), ]
  zips ABC DEF GHI JKL
4    4 0.6 0.4 0.2 0.3
6    6 0.2 0.7 0.3 0.4

Recommended answer

You can use the negated rowSums() for the subset:

df[!rowSums(df[-1] > 0.7), ]
#   zips ABC DEF GHI JKL
# 4    4 0.6 0.4 0.2 0.3
# 6    6 0.2 0.7 0.3 0.4

  • df[-1] > 0.7 gives us a logical matrix telling us which values of df[-1] are greater than 0.7
  • rowSums() sums across each row of that matrix (each TRUE counts as 1, each FALSE as 0)
  • ! converts those sums to logical and negates them, so any row sum of zero (FALSE) becomes TRUE. In other words, we want the rows where the rowSums() result is zero.
  • we use that logical vector for the row subset

Another way to get the same logical vector would be to do

rowSums(df[-1] > 0.7) == 0
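
To make the steps concrete, here is a minimal sketch that rebuilds the sample data from the question and evaluates each piece separately. The intermediate names (x, over, n_over, keep and so on) are only illustrative and do not appear in the original post. The apply() line is one possible answer to the "apply function" part of the question, not something the answer itself proposed.

```r
# Recreate the sample data from the question (the name `df` follows the answer).
df <- data.frame(
  zips = 1:10,
  ABC = c(0.8, 0.1, 0.5, 0.6, 1.0, 0.2, 0.3, 0.7, 0.9, 0.4),
  DEF = c(0.6, 0.3, 0.1, 0.4, 0.8, 0.7, 1.0, 0.9, 0.5, 0.2),
  GHI = c(0.1, 0.8, 0.4, 0.2, 0.6, 0.3, 1.0, 0.5, 0.9, 0.7),
  JKL = c(0.6, 1.0, 0.8, 0.3, 0.5, 0.4, 0.2, 0.1, 0.7, 0.9)
)
x <- 0.7  # threshold (illustrative name)

over   <- df[-1] > x     # logical matrix: TRUE where a value exceeds x
n_over <- rowSums(over)  # number of values above x in each row
keep   <- !n_over        # 0 is coerced to FALSE, so !0 gives TRUE

# Equivalent ways to build the same row filter
keep_eq    <- n_over == 0                 # explicit comparison
keep_apply <- apply(df[-1] <= x, 1, all)  # row-wise all() via apply()

# Strict "less than x": also drop rows that contain a value equal to x
keep_strict <- rowSums(df[-1] >= x) == 0

result <- df[keep, ]
print(result)
```

Because the test is > 0.7, a value exactly equal to 0.7 (row 6, column DEF) is kept. If every value must be strictly below the threshold, as the title suggests, keep_strict is the filter to use. The rowSums() form is usually preferred over apply() because it is vectorized, though for a data frame this small the difference is negligible.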
      
