Removing duplicated column characters of a dataset in R
Problem description
I am new to R and am having trouble removing duplicated character values.
Here is my code:
library(RCurl)
x <- getURL("https://raw.githubusercontent.com/eparker12/nCoV_tracker/master/input_data/coronavirus.csv")
y <- read.csv(text = x)
z <- duplicated(y$jhuID)
I tried something like z <- ... but it did not work.
The column jhuID in the data frame is of class character, but many country names are repeated multiple times. My goal is to delete those duplicated country names so that each name remains only once, with the column still of class character.
For example, if I view the data with y$jhuID, I see every country name appearing multiple times. I want a new data frame, for example z, so that when I view z$jhuID each country name appears only once.
Any help would be much appreciated! Thanks in advance.
Recommended answer
With distinct and arrange from dplyr:
library(dplyr)
y %>%
  # keep the first row for each jhu_ID, retaining all other columns
  distinct(jhu_ID, .keep_all = TRUE) %>%
  # sort the result by jhu_ID
  arrange(jhu_ID)
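A base-R sketch of the same idea may help show why the attempt in the question did not work: duplicated() returns a logical vector rather than a data frame, so it has to be used as a row filter. The toy data frame below is an assumption standing in for the downloaded CSV (its values are made up), and the column name jhu_ID follows the code in this answer.

```r
# Toy data standing in for the downloaded CSV (values are made up for illustration).
y <- data.frame(
  jhu_ID = c("Italy", "France", "Italy", "Spain", "France"),
  cases  = c(10, 20, 30, 40, 50),
  stringsAsFactors = FALSE
)

# duplicated() returns a logical vector: TRUE marks a repeat of an earlier value.
dup <- duplicated(y$jhu_ID)

# Keep only the first row for each jhu_ID, then sort the rows by that column.
z <- y[!dup, ]
z <- z[order(z$jhu_ID), ]

# The column is still of class character, and each name now appears once.
class(z$jhu_ID)
z$jhu_ID
```

If only the unique names are needed rather than whole rows, unique(y$jhu_ID) returns them as a character vector.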