Distance from the closest non-NA value in a dataframe

Views: 37 | Published: 2020/10/17 0:23:22 | Tags: r, dataframe

Question

I have the following dataframe df, and I want to add a column giving, for each row, the distance to the closest non-NA value.

df <- data.frame(x = 1:20)
df[c(1, 3, 4, 5, 11, 14, 15, 16), "x"] <-  NA

In other words, I am looking for the following values:

df$distance <- c(1, 0, 1, 2, 1, 0, 0, 0, 0, 0, 1, 0, 0, 1, 2, 1, 0, 0, 0, 0)

How can I compute this automatically?

Recommended answer

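Let x be the vector containing the NA values (df$x in the question). Define the positions of the non-NA and NA entries: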

a <- which(!is.na(x))
b <- which(is.na(x))

The task is then to find min(abs(a - b[i])) for every b[i]; non-NA positions have distance 0.

This type of task is not easy to accomplish efficiently in pure R code. Writing the loop in compiled code is generally a better choice, unless some package already provides a function that does this for us.

Some naive but straightforward solutions follow.

If x is not too long, we can use outer:

distance <- numeric(length(x))
distance[is.na(x)] <- apply(abs(outer(a, b, "-")), 2L, min)
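
To see what the outer call is doing, here is a small sketch on a made-up 7-element vector (the values of x are an assumption chosen only for illustration). Each column of the matrix holds the absolute distances from one NA position to every non-NA position, so the column minimum is the distance we want.

```r
# Hypothetical toy input: non-NA values at positions 3 and 7
x <- c(NA, NA, 5, NA, NA, NA, 7)

a <- which(!is.na(x))  # positions of non-NA entries
b <- which(is.na(x))   # positions of NA entries

# Rows correspond to a, columns to b; entry [i, j] is |a[i] - b[j]|
m <- abs(outer(a, b, "-"))

distance <- numeric(length(x))             # non-NA positions keep 0
distance[is.na(x)] <- apply(m, 2L, min)    # column-wise minimum per NA position
distance
```

The matrix has length(a) * length(b) entries, which is why this version is only suitable for short vectors.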

If x is long and the memory usage of outer becomes a problem, we can instead do

distance <- numeric(length(x))
distance[is.na(x)] <- sapply(b, function (bi) min(abs(bi - a)))
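
For completeness, the pieces above can be put together on the data from the question. This is a minimal sketch that assumes x is taken from df$x and writes the result back as a new column; the `expected` vector is the one given in the question.

```r
# Data from the question
df <- data.frame(x = 1:20)
df[c(1, 3, 4, 5, 11, 14, 15, 16), "x"] <- NA

x <- df$x
a <- which(!is.na(x))  # positions of non-NA entries
b <- which(is.na(x))   # positions of NA entries

# Non-NA rows get 0; each NA row gets the distance to its nearest non-NA row
df$distance <- numeric(length(x))
df$distance[is.na(x)] <- sapply(b, function(bi) min(abs(bi - a)))

# Values requested in the question
expected <- c(1, 0, 1, 2, 1, 0, 0, 0, 0, 0, 1, 0, 0, 1, 2, 1, 0, 0, 0, 0)
stopifnot(all(df$distance == expected))
```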

Note that neither method is truly efficient from an algorithmic point of view: both compare every NA position against every non-NA position.
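
One possible way to avoid the pairwise comparison, not part of the original answer, is to carry the nearest non-NA position forward and backward with cummax and cummin. The sketch below is an assumption-laden illustration: the function name nearest_non_na_distance is hypothetical, and returning Inf when x contains no non-NA value at all is a choice made here, not something the question specifies.

```r
# Hypothetical helper: distance from each position to the nearest non-NA position
nearest_non_na_distance <- function(x) {
  n <- length(x)
  idx <- seq_len(n)
  ok <- !is.na(x)

  # Position of the last non-NA entry at or before each index (0 = none so far)
  prev <- cummax(ifelse(ok, idx, 0L))
  # Position of the next non-NA entry at or after each index (n + 1 = none ahead)
  nxt <- rev(cummin(rev(ifelse(ok, idx, n + 1L))))

  # Missing neighbours on one side count as infinitely far away
  d_prev <- ifelse(prev == 0L, Inf, idx - prev)
  d_next <- ifelse(nxt == n + 1L, Inf, nxt - idx)

  as.numeric(pmin(d_prev, d_next))
}

# Usage on the question's data
x <- 1:20
x[c(1, 3, 4, 5, 11, 14, 15, 16)] <- NA
nearest_non_na_distance(x)
```

Each step is a single vectorized pass over x, so the work grows linearly with length(x) rather than with length(a) * length(b).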
