How to remove a word or two at the end of a string in a data frame in R?

Published: 2021/8/31 18:45:40 | Tags: r, gsub, stringr

Question

I have a data frame with a column called "Country". When the country of origin is the United States, the entries are listed as "Louisiana - USA", for example. I am trying to get rid of the "- USA" at the end, so that each entry only says which state it came from.

So, I currently have something like this (though my real data has thousands of entries):

df <- data.frame(ID = 1:4, Country = c("Louisiana - USA", "Canada","France", "Maine - USA"))

What I tried is the following:

for (i in 1:nrow(df)) {
    df$USA[i] <- ifelse(grepl(" USA| États-Unis", df$Country[i]), 1, 0) 
}

index_USA <- which(df$USA == 1)

for (int in index_USA) {
    gsub(" - USA", "", df$Country[int])
}

However, this code is not working. I also tried using the stringr package instead of gsub, replacing the last for loop with:

for (int in index_USA) {
    str_replace_all(df$Country[int], " - USA", "")
}

But this did not work either. I feel like I'm making an obvious mistake, but I cannot figure it out (perhaps I need to use regex?).

Recommended answer

You want to remove " USA" and " États-Unis" at the end of the string. So, you need

df$Country <- sub("\\s+(?:USA|États-Unis)$", "", df$Country)
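A likely reason the loops in the question had no visible effect: gsub() and str_replace_all() return a modified copy of their input, and that result was never assigned back to df$Country (inside a for loop it is not printed either). sub() is vectorised over the whole column, so no loop or index vector is needed. A minimal sketch on the question's sample data, where stringsAsFactors = FALSE is an added assumption so the column is plain character in older R versions:

```r
# Sample data from the question (stringsAsFactors = FALSE is an assumption
# added here so that Country is a character column in R < 4.0 as well)
df <- data.frame(ID = 1:4,
                 Country = c("Louisiana - USA", "Canada", "France", "Maine - USA"),
                 stringsAsFactors = FALSE)

# Without assignment the result is discarded and df stays unchanged
gsub(" - USA", "", df$Country)
before <- df$Country

# One vectorised call on the whole column, assigned back to the column
df$Country <- sub("\\s+(?:USA|États-Unis)$", "", df$Country)

print(before)
print(df$Country)
```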

Details

- \\s+ - 1 or more whitespace chars
- (?: - start of a (non-capturing) grouping construct, matching either of the two alternatives:
  - USA - USA substring
  - | - or
  - États-Unis - États-Unis substring
- ) - end of the group
- $ - end of string
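
On values shaped like the question's examples, the pattern above strips only the whitespace and the country name, so "Louisiana - USA" becomes "Louisiana -". If the separator should be removed as well, one possible variant also matches the hyphen and any whitespace around it. This is a sketch under that assumption about the desired output, not part of the original answer; the vector name country and the "Texas - États-Unis" entry are hypothetical.

```r
# Hypothetical sample values mirroring the question's data
country <- c("Louisiana - USA", "Canada", "France", "Maine - USA",
             "Texas - États-Unis")

# \\s*-\\s* consumes the " - " separator before the country name;
# the rest of the pattern is the same as in the answer
state <- sub("\\s*-\\s*(?:USA|États-Unis)$", "", country)

print(state)
```

Because this variant requires the hyphen, an entry such as "Louisiana USA" with no separator would be left unchanged.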