Removing duplicated column characters of dataset in R

Problem description

I am new to R and I am having trouble removing duplicated values from a character column.

Here is my code:

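library(RCurl)
# Download the CSV as text, then parse it into a data frame
x <- getURL("https://raw.githubusercontent.com/eparker12/nCoV_tracker/master/input_data/coronavirus.csv")
y <- read.csv(text = x)
# My attempt: flag the repeated values in the ID column
z <- duplicated(y$jhuID)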

I tried something like z <- ... but it did not work. The column jhuID in the data frame is of class character, and many country names in it repeat multiple times. My goal is to delete the duplicated country names so that each name remains only once, with the column still of class character.

For example, if I view the data with y$jhuID, I see every country name appearing multiple times. I want a new data frame, for example z, so that when I view z$jhuID each country name appears only once.

Any help with this would be much appreciated! Thanks in advance.

Recommended answer

With distinct and arrange from dplyr:

library(dplyr)

# Keep the first row for each jhu_ID (with all other columns), then sort by jhu_ID
y %>%
  distinct(jhu_ID, .keep_all = TRUE) %>%
  arrange(jhu_ID)
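
For comparison, the same result can probably be reached in base R by building on the duplicated() call from the question. The sketch below is only an illustration: it uses a small made-up data frame in place of the downloaded CSV, and it assumes the column is named jhu_ID as in the answer above (the cases column is hypothetical).

```r
# Toy data standing in for the downloaded data frame (values are made up).
y <- data.frame(
  jhu_ID = c("Italy", "France", "Italy", "Spain", "France"),
  cases  = c(10, 5, 20, 7, 9),   # hypothetical column, for illustration only
  stringsAsFactors = FALSE
)

# duplicated() returns TRUE for every repeat after the first occurrence,
# so negating it keeps the first row seen for each jhu_ID.
z <- y[!duplicated(y$jhu_ID), ]

# Sort by name to mirror arrange(jhu_ID).
z <- z[order(z$jhu_ID), ]

z$jhu_ID         # each name appears once
class(z$jhu_ID)  # still "character"
```

The original attempt, z <- duplicated(y$jhuID), stores only a logical vector of TRUE/FALSE flags. Using that vector to subset the rows of y is what turns it into a de-duplicated data frame.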
