Splitting column by separator from right to left in R


Problem description

I'm working on a dataset in which one column (Place) holds a free-text location description.

library(tidyverse)

example <- tibble(Datum = c("October 1st 2017", 
                            "October 2nd 2017",
                            "October 3rd 2017"),
             Place = c("Tabiyyah Jazeera village, 20km south east of Deir Ezzor, Deir Ezzor Governorate, Syria",
                       "Abu Kamal, Deir Ezzor Governorate, Syria",
                       "شارع القطار al Qitar [train] street, al-Tawassiya area, north of Raqqah city centre, Raqqah governorate, Syria"))

I would like to split the Place column on the comma separator, preferably with a tidyverse solution. Because the values of Place contain different numbers of comma-separated parts, I would like to split from right to left, so that the country (Syria) always ends up in the last column of the data frame.

Oh, and as a bonus: which regex would delete the Arabic characters?

Thanks in advance.

Found my answer. For removing the Arabic characters (thanks to @g5w):

# Strip characters in the Unicode Arabic block (U+0600 to U+06FF) before splitting
example$Place <- gsub("[\u0600-\u06FF]", "", example$Place)

And for splitting the column the tidyr way, where fill = "left" pads short rows with NA on the left:

airstrikes_okt_clean <- separate(example, 
                             Place, 
                             into = c("detail", 
                                      "detail2", 
                                      "City_or_village", 
                                      "District", 
                                      "Country"), 
                             sep = ",", 
                             fill = "left") 

Recommended answer

Just split the string on the comma and then reverse it.

 lapply(strsplit(example$Place, ","), rev)
[[1]]
[1] " Syria"                         " Deir Ezzor Governorate"       
[3] " 20km south east of Deir Ezzor" "Tabiyyah Jazeera village"      

[[2]]
[1] " Syria"                  " Deir Ezzor Governorate"
[3] "Abu Kamal"              

[[3]]
[1] " Syria"                              " Raqqah governorate"                
[3] " north of Raqqah city centre"        " al-Tawassiya area"                 
[5] "شارع القطار al Qitar [train] street"
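The reversed lists above put the country first. If the goal is a table with the country in the last column, one possible base-R sketch is to left-pad each split row with NA instead of reversing it. The column names are borrowed from the question's own attempt and are assumptions, as is the choice to omit the Arabic text from the sample values; the hard-coded names only fit because the widest row here has five parts.

```r
# Sample values taken from the question's Place column (Arabic text omitted).
place <- c(
  "Tabiyyah Jazeera village, 20km south east of Deir Ezzor, Deir Ezzor Governorate, Syria",
  "Abu Kamal, Deir Ezzor Governorate, Syria",
  "al Qitar [train] street, al-Tawassiya area, north of Raqqah city centre, Raqqah governorate, Syria"
)

# Split on commas and swallow surrounding whitespace so the pieces come out trimmed.
parts <- strsplit(place, "\\s*,\\s*")

# The widest row decides how many columns are needed.
n_col <- max(lengths(parts))

# Left-pad each row with NA so the last piece always lands in the last column.
padded <- lapply(parts, function(p) c(rep(NA_character_, n_col - length(p)), p))

# Bind the rows into a character matrix, then convert to a data frame.
result <- as.data.frame(do.call(rbind, padded), stringsAsFactors = FALSE)

# Hypothetical column names, reused from the question; they assume n_col is 5.
names(result) <- c("detail", "detail2", "City_or_village", "District", "Country")

print(result)
```

This has the same effect as fill = "left" in tidyr's separate(), so the two approaches should agree on this sample apart from the whitespace trimming done by the split pattern.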

To get rid of the Arabic characters before splitting, try

gsub("[\u0600-\u06FF]", "", example$Place)
[1] "Tabiyyah Jazeera village, 20km south east of Deir Ezzor, Deir Ezzor Governorate, Syria"              
[2] "Abu Kamal, Deir Ezzor Governorate, Syria"                                                            
[3] "  al Qitar [train] street, al-Tawassiya area, north of Raqqah city centre, Raqqah governorate, Syria"
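As the third result shows, removing the Arabic characters leaves stray spaces behind. A minimal sketch of one way to tidy that up is to collapse repeated spaces and trim the ends. The shortened sample string and the variable names are assumptions for illustration, and the range U+0600 to U+06FF covers only the basic Unicode Arabic block, so text using other Arabic blocks would need a wider pattern.

```r
# Shortened version of the third Place value, with the Arabic words written
# as \u escapes so the script itself stays ASCII-only.
place <- "\u0634\u0627\u0631\u0639 \u0627\u0644\u0642\u0637\u0627\u0631 al Qitar [train] street, al-Tawassiya area"

# Remove every character in the basic Arabic block (U+0600 to U+06FF).
no_arabic <- gsub("[\u0600-\u06FF]", "", place)

# Collapse runs of two or more spaces into one, then trim both ends.
cleaned <- trimws(gsub(" {2,}", " ", no_arabic))

print(cleaned)
```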
