Conditionally Remove Character of a Vector Element in R

Views: 569. Published: 2016/11/18 16:25:01. Tags: regex, r, string, character


Problem description

 
I have (sometimes incomplete) data on addresses that looks like this:
data <- c("1600 Pennsylvania Avenue, Washington DC", 
          ",Siem Reap,FC,", "11 Wall Street, New York, NY", ",Addis Ababa,FC,")  
I need to remove the first and/or last character if either one of them is a comma.

So far, I have:
for(i in 1:length(data)){
  lastchar <- nchar(data[i])
  sec2last <- nchar(data[i]) - 1
  if(regexpr(",",data[i])[1] == 1){
    data[i] <- substr(data[i],2, lastchar)
  }
  if(regexpr(",",data[i])[1] == nchar(data[i])){
    data[i] <- substr(data[i],1, sec2last)
  }
}

data
This works for the first character, but not for the last character. How can I modify the second if statement or otherwise accomplish my goal?
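A note on why the second if never fires: regexpr() returns the position of the first match, so once the leading comma is gone it points at the first interior comma rather than at the last character. In addition, sec2last is computed before the string is shortened, so it no longer marks the second-to-last position. The following is a minimal sketch of one way to repair the loop by checking the first and last characters directly with substr(). It assumes the vector contains no NA values, and the variable n is introduced here purely for illustration.

```r
data <- c("1600 Pennsylvania Avenue, Washington DC",
          ",Siem Reap,FC,", "11 Wall Street, New York, NY", ",Addis Ababa,FC,")

for (i in seq_along(data)) {
  # Drop a leading comma, if present
  if (substr(data[i], 1, 1) == ",") {
    data[i] <- substr(data[i], 2, nchar(data[i]))
  }
  # Recompute the length after the first edit (n is an illustrative name),
  # then drop a trailing comma, if present
  n <- nchar(data[i])
  if (substr(data[i], n, n) == ",") {
    data[i] <- substr(data[i], 1, n - 1)
  }
}

data
```

Recomputing the length inside the loop body is the key change: both the test and the substr() call then refer to the string as it currently is, not as it was before the first edit.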
Solution
You could try the code below, which removes a comma present at the start or at the end of each element:
> data <- c("1600 Pennsylvania Avenue, Washington DC", 
+           ",Siem Reap,FC,", "11 Wall Street, New York, NY", ",Addis Ababa,FC,")
> gsub("(?<=^),|,(?=$)", "", data, perl=TRUE)
[1] "1600 Pennsylvania Avenue, Washington DC"
[2] "Siem Reap,FC"                           
[3] "11 Wall Street, New York, NY"           
[4] "Addis Ababa,FC" 
Pattern explanation:


(?<=^), In regex, (?<=...) is called a positive lookbehind. Here it asserts that what precedes the comma must be the start of the line, ^. So it matches a leading comma.
| The alternation (OR) operator, used to combine two regexes so that either one may match.
,(?=$) The positive lookahead (?=...) asserts that what follows the comma must be the end of the line, $. So it matches a trailing comma.
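
Because ^ and $ are themselves zero-width assertions, wrapping them in lookarounds is not strictly required. As a sketch of a simpler variant (not part of the original answer), the pattern ^,|,$ should give the same result without perl=TRUE. The names cleaned and with_lookaround are illustrative only.

```r
data <- c("1600 Pennsylvania Avenue, Washington DC",
          ",Siem Reap,FC,", "11 Wall Street, New York, NY", ",Addis Ababa,FC,")

# Plain anchors: a comma at the start of the string, or a comma at the end
cleaned <- gsub("^,|,$", "", data)

# The lookaround form from the answer above, for comparison
with_lookaround <- gsub("(?<=^),|,(?=$)", "", data, perl = TRUE)

print(cleaned)
print(identical(cleaned, with_lookaround))
```

Interior commas are left alone in both forms, since neither alternative can match anywhere except the first or last position of the string.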

