Subset dataframe such that all values in each row are less than a certain value
Problem description
I have a dataframe with a dimension column and 4 value columns. How can I subset the dataframe so that all 4 value columns for each record are less than a given x? I know I could do this manually using subset and specifying the condition for each column, but is there a way to do it using maybe an apply function? Below is a sample dataframe. For example, let's say x is 0.7. In that case I would want to eliminate any row where any column of that row is more than 0.7.
   zips ABC DEF GHI JKL
1     1 0.8 0.6 0.1 0.6
2     2 0.1 0.3 0.8 1.0
3     3 0.5 0.1 0.4 0.8
4     4 0.6 0.4 0.2 0.3
5     5 1.0 0.8 0.6 0.5
6     6 0.2 0.7 0.3 0.4
7     7 0.3 1.0 1.0 0.2
8     8 0.7 0.9 0.5 0.1
9     9 0.9 0.5 0.9 0.7
10   10 0.4 0.2 0.7 0.9
The following expression seemed to work, but could someone explain the logic here?
Variance_Percentile[!rowSums(Variance_Percentile[-1] > 0.7), ]
  zips ABC DEF GHI JKL
4    4 0.6 0.4 0.2 0.3
6    6 0.2 0.7 0.3 0.4
You can use the negated rowSums() for the subset:
df[!rowSums(df[-1] > 0.7), ]
# zips ABC DEF GHI JKL
# 4 4 0.6 0.4 0.2 0.3
# 6 6 0.2 0.7 0.3 0.4
- df[-1] > 0.7 gives us a logical matrix telling us which values of df[-1] (every column except the first) are greater than 0.7
- rowSums() sums across those rows (each TRUE value is equal to 1, FALSE is zero), so each row sum is the number of values above 0.7 in that row
- ! converts those sums to logical and negates them, so that any row sum which is zero (FALSE) becomes TRUE, and any nonzero sum becomes FALSE. In other words, if the rowSums() result is zero, we want those rows.
- we use that logical vector for the row subset
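The steps above can be traced one at a time. The following is a minimal sketch that rebuilds the sample data from the question; the intermediate names over, n_over and keep are made up here for illustration and do not appear in the original answer.

```r
# Sample data copied from the question; zips is the identifier column.
df <- data.frame(
  zips = 1:10,
  ABC = c(0.8, 0.1, 0.5, 0.6, 1.0, 0.2, 0.3, 0.7, 0.9, 0.4),
  DEF = c(0.6, 0.3, 0.1, 0.4, 0.8, 0.7, 1.0, 0.9, 0.5, 0.2),
  GHI = c(0.1, 0.8, 0.4, 0.2, 0.6, 0.3, 1.0, 0.5, 0.9, 0.7),
  JKL = c(0.6, 1.0, 0.8, 0.3, 0.5, 0.4, 0.2, 0.1, 0.7, 0.9)
)

# Step 1: logical matrix, TRUE wherever a value column exceeds 0.7.
# df[-1] drops the first column (zips) so it is not compared.
over <- df[-1] > 0.7

# Step 2: count the TRUE values in each row (TRUE counts as 1, FALSE as 0).
n_over <- rowSums(over)

# Step 3: negate. A count of 0 becomes TRUE, any nonzero count becomes FALSE.
keep <- !n_over

# Step 4: use the logical vector to pick rows; the empty column index keeps all columns.
result <- df[keep, ]
print(result)
```

Only rows 4 and 6 have a count of zero, so those are the rows that remain, matching the output shown in the answer.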
Another way to get the same logical vector would be to do
rowSums(df[-1] > 0.7) == 0
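As a quick check, the two forms can be compared on the sample data. The sketch below also includes an apply()-based variant, since the question asks about one; that variant is an illustration added here rather than part of the original answer, and the names keep_negated, keep_compared and keep_apply are hypothetical. Note that it uses <= 0.7, because the answer's output keeps row 6, which contains exactly 0.7.

```r
# Sample data copied from the question; zips is the identifier column.
df <- data.frame(
  zips = 1:10,
  ABC = c(0.8, 0.1, 0.5, 0.6, 1.0, 0.2, 0.3, 0.7, 0.9, 0.4),
  DEF = c(0.6, 0.3, 0.1, 0.4, 0.8, 0.7, 1.0, 0.9, 0.5, 0.2),
  GHI = c(0.1, 0.8, 0.4, 0.2, 0.6, 0.3, 1.0, 0.5, 0.9, 0.7),
  JKL = c(0.6, 1.0, 0.8, 0.3, 0.5, 0.4, 0.2, 0.1, 0.7, 0.9)
)

# Form used in the answer: negate the per-row count of values above 0.7.
keep_negated <- !rowSums(df[-1] > 0.7)

# Explicit form: compare the per-row count with zero.
keep_compared <- rowSums(df[-1] > 0.7) == 0

# Row-wise apply(): TRUE when every value in the row is at most 0.7.
keep_apply <- apply(df[-1], 1, function(row) all(row <= 0.7))

# All three vectors select the same rows.
print(all(keep_negated == keep_compared))
print(all(keep_negated == keep_apply))
print(df[keep_compared, ])
```

The explicit == 0 comparison reads more directly than the negation; the rowSums() forms are vectorised, whereas apply() calls the function once per row.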