R - How to delete two quasi-identical rows of a data frame?

Views: 107 | Published: 2017/3/26 2:17:00 | Tags: r, dataframe, delete-row

Problem description

I have a data frame that I need to clean up by removing duplicates based on two variables, but the values of both variables are only "quasi-identical" across rows. That is, one row may contain a -, a ', an s, a :, or a space that the other row lacks. I tried unique(), but this function only works with identical values. Suppose we have this data.frame:

Id<-c("RoLu1976","Rolu1976","AlBl1989","ThSa1996")
Art<-c("Econometric Policy Evaluation: A Critique","Econometric Policy Evaluations A Critique", "Rules after discretion", "Expectations and the Nonneutrality of Lucas")
Id.1<-c("FiKy1989","EdPr1986","BeBe1983","JoSt1989")
Art.1<-c("Notes on the Lucas Critique","Notes on the Lucas Critique","The Inconsistency of Optimal Plans","The Inconsistency of Optimal Plans")
N<-data.frame(Id,Art,Id.1,Art.1)

The quasi-identical values are in the variable Art in the first two observations, which differ only by an s and a :. How can I filter out and delete these kinds of values?
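
To make the problem concrete, here is a small check. It is only a sketch that rebuilds part of the Art column from the example; the variable names exact_dups and d are illustrative. It shows that exact comparison does not flag the second title, while the edit distance between the two titles is very small.

```r
# The two near-duplicate titles plus one clearly different title
Art <- c("Econometric Policy Evaluation: A Critique",
         "Econometric Policy Evaluations A Critique",
         "Rules after discretion")

# Exact comparison: duplicated() only flags identical strings
exact_dups <- duplicated(Art)

# Pairwise Levenshtein (edit) distance between all titles
d <- utils::adist(Art)

print(exact_dups)  # is any title an exact repeat of an earlier one?
print(d)           # small off-diagonal entries indicate near-duplicates
```

The first two titles differ by a single substitution (":" versus "s"), which is why an approximate matching function is a natural fit here.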

Solution

Based on your data, I used agrep to match similar strings:

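yy = NULL
for(i in seq_along(N$Art)){
    # all titles in N$Art that approximately match the title in row i
    temp = agrep(N[i,"Art"],N$Art,value=TRUE)
    # if the row's own title is among the matches, use the first match as the
    # common spelling; as.character() guards against Art being a factor
    y = ifelse(any(N[i,"Art"]==temp),temp[1],as.character(N[i,"Art"]))
    yy = c(yy,y)
}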
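
The same idea can be wrapped in a small function. This is a sketch rather than the answerer's code: the name canonical_titles and the explicit max.distance argument are assumptions added for illustration (0.1 is agrep's documented default, expressed as a fraction of the pattern length).

```r
# Hypothetical helper: map each title to the first entry of x that
# approximately matches it, so near-duplicates share one spelling.
canonical_titles <- function(x, max.distance = 0.1) {
  x <- as.character(x)  # guard against factor columns
  vapply(x, function(title) {
    # all entries of x that approximately match this title
    matches <- agrep(title, x, max.distance = max.distance, value = TRUE)
    # fall back to the title itself if nothing matched
    if (length(matches) > 0) matches[1] else title
  }, character(1), USE.NAMES = FALSE)
}

Art <- c("Econometric Policy Evaluation: A Critique",
         "Econometric Policy Evaluations A Critique",
         "Rules after discretion",
         "Expectations and the Nonneutrality of Lucas")

yy <- canonical_titles(Art)
print(yy)
```

Note that agrep looks for approximate matches of the pattern within each element, so a short title could also match inside a longer, unrelated one. On real data it may be worth lowering max.distance and inspecting the matches before deleting rows.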

Then replace N$Art with yy, which allows you to use duplicated/unique:

N$Art = yy
N.2 = N[!duplicated(N$Art), ]
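
The question mentions deduplicating on two variables, while the code above uses only Art. The following sketch covers both, assuming the second variable is Id and that its variants differ only in letter case, as in the example rows. The names id_key, art_key, and keep are illustrative, and the original Art spellings are left unchanged in the result.

```r
Id    <- c("RoLu1976", "Rolu1976", "AlBl1989", "ThSa1996")
Art   <- c("Econometric Policy Evaluation: A Critique",
           "Econometric Policy Evaluations A Critique",
           "Rules after discretion",
           "Expectations and the Nonneutrality of Lucas")
Id.1  <- c("FiKy1989", "EdPr1986", "BeBe1983", "JoSt1989")
Art.1 <- c("Notes on the Lucas Critique", "Notes on the Lucas Critique",
           "The Inconsistency of Optimal Plans",
           "The Inconsistency of Optimal Plans")
N <- data.frame(Id, Art, Id.1, Art.1, stringsAsFactors = FALSE)

# Key 1: first approximately matching title for each row
art_key <- vapply(N$Art,
                  function(a) agrep(a, N$Art, value = TRUE)[1],
                  character(1), USE.NAMES = FALSE)

# Key 2: case-insensitive Id (assumption: only letter case differs)
id_key <- tolower(N$Id)

# A row is dropped only if both keys repeat an earlier row
keep <- !duplicated(data.frame(id_key, art_key))
N.2 <- N[keep, ]
print(N.2)
```

Requiring both keys to match is more conservative than deduplicating on Art alone, since two different papers with similar titles are kept as long as their Ids differ.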
