实施天际线查询或高效边界 [英] Implementation of skyline query or efficient frontier
问题描述
我有一个数据框2列数字。
我想从中删除具有属性的行,数据框中至少存在另一行,两列的值大于此行中的列。
所以如果我有
Col1 Col2
1 2 3
2 4 7
3 5 6
我想删除第一行,因为第二个满足属性,只保留第2行和第3行。
非常感谢!
这个问题被数据库管理员称为天际线查询(他们可能有其他算法)和经济学家的有效前沿。
绘制数据可以清楚我们在寻找什么。
n < - 40
d < - data.frame(
x = rnorm(n),
y = rnorm(n)
)
#我们希望在以下图中的极限点
par(mar = c(1,1,1,1))
plot(d,axes = FALSE,xlab =,ylab =)
for(i in 1:n) {
polygon(c(-10,d $ x [i],d $ x [i], - 10),c(-10,-10,d $ y [i],d $ y [i ]),
col = rgb(.9,.9,.9,.2))
}
算法如下:沿着第一个坐标对点进行排序,
保持每个观察,除非它比最后保留的更差。
d< - d [order(d $ x,decrement = TRUE)]]
result< - d [1,]
for我在seq_len(nrow(d))[ - 1]){
if(d $ y [i]> result $ y [nrow(result)]){
result< - rbind结果,d [i,])#无效
}
}
点(结果,cex = 3,pch = 15)
I know there must be an easy answer to this but somehow I can't seem to find it...
I have a data frame with 2 numeric columns. I would like to remove from it, the rows, which have the property, that there exists at least one other row in the data frame, with both column values bigger than the ones in this row.
So if I have
Col1 Col2
1 2 3
2 4 7
3 5 6
I would like to remove the first row, because the second one fulfills the property and keep only rows 2 and 3.
Thanks a lot!
That problem is called a "skyline query" by database administrators (they may have other algorithms) and an "efficient frontier" by economists. Plotting the data can make it clear what we are looking for.
n <- 40
d <- data.frame(
x = rnorm(n),
y = rnorm(n)
)
# We want the "extreme" points in the following plot
par(mar=c(1,1,1,1))
plot(d, axes=FALSE, xlab="", ylab="")
for(i in 1:n) {
polygon( c(-10,d$x[i],d$x[i],-10), c(-10,-10,d$y[i],d$y[i]),
col=rgb(.9,.9,.9,.2))
}
The algorithm is as follows: sort the points along the first coordinate, keep each observation unless it is worse than the last retained one.
d <- d[ order(d$x, decreasing=TRUE), ]
result <- d[1,]
for(i in seq_len(nrow(d))[-1] ) {
if( d$y[i] > result$y[nrow(result)] ) {
result <- rbind(result, d[i,]) # inefficient
}
}
points(result, cex=3, pch=15)
这篇关于实施天际线查询或高效边界的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!