Having trouble keeping all variables after removing duplicates from a dataset
Problem description
So, I imported a dataset with 178 observations and 8 variables. The end goal was to eliminate all observations that were the same across three of those variables (2, 5, and 6). This proved quite easy using the unique command.
mav2 <- unique(mav[,c(2,5,6)])
The resulting mav2 dataframe contained 55 observations, getting rid of all the duplicates! Unfortunately, it also got rid of the other five variables that I did not use in the unique command (1, 3, 4, 7, and 8). I initially tried adding the two dataframes, but of course this did not work since they were of unequal size. I have also tried merging the two, but this fails and just gives an output of the first dataset with all 178 observations.
The second dataset (mav2) did produce a new column (row.names), which is the row number of each observation from the initial dataset.
If anyone could help me out on getting all 8 initial variables into a dataset with only the 55 unique observations, I would be very appreciative. Thanks in advance.
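I think what you want is duplicated, a function similar to unique that returns a logical vector marking which elements (here, rows) are duplicates of an earlier one. Negating that vector and using it as a row index keeps the first occurrence of each combination along with all of its columns.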
So
mav2 <- mav[!duplicated(mav[,c(2,5,6)]),]
EDIT: inverted sense of duplicated
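To see the approach end to end, here is a minimal sketch on a made-up data frame. The toy mav below, its column names v1 to v8, and the helper names key_cols, is_dup and mav2_alt are all assumptions for illustration; they are not the asker's data. The last line also shows that the row names carried over by unique can be used to pull the full rows back out, which should select the same rows.

```r
# Toy stand-in for mav: 6 rows, 8 columns (hypothetical data, not the asker's)
mav <- data.frame(
  v1 = 1:6,
  v2 = c("a", "a", "b", "b", "c", "a"),
  v3 = letters[1:6],
  v4 = 11:16,
  v5 = c(1, 1, 2, 2, 3, 1),
  v6 = c("x", "x", "y", "z", "x", "x"),
  v7 = 21:26,
  v8 = LETTERS[1:6]
)

# Columns that define what counts as a duplicate
key_cols <- c(2, 5, 6)

# duplicated() returns a logical vector: TRUE for rows whose key
# combination has already appeared earlier in the data frame
is_dup <- duplicated(mav[, key_cols])

# Keep the first occurrence of each key combination, with all 8 columns
mav2 <- mav[!is_dup, ]

# Alternative: reuse the row names that unique() preserves to index mav
mav2_alt <- mav[rownames(unique(mav[, key_cols])), ]
```

In this toy data, rows 2 and 6 repeat the key combination of row 1, so mav2 keeps rows 1, 3, 4 and 5 with all eight columns. Note that the first occurrence is the one retained; if a different row per group is wanted, the data would need to be sorted accordingly beforehand.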