Using gsub in R to remove values in Zip Code field

Question

I have a data frame that contains columns of values, one of which is United States Postal Zip codes.

    Row_num Restaurant Address             City     State Zip 
    26698   m          1460 Memorial Drive Chicopee MA    01020-3964

For this entry, I want to keep only the 5-digit zip code 01020 and remove the "-3964" after it, and do this for every entry in my data frame. Right now the zip code column is being treated as a character (chr) column by R.

I have tried the following gsub code:

    df$Zip <- gsub(df$Zip, pattern="-[0,9]{0,4}", replacement = "")

However, all that does is replace the "-" with nothing. That is neither what I want nor what I expected, so any help explaining how gsub behaves and how to get the desired result would be appreciated.

Thanks!

I have found out through trial and error that this block of code works as well:

    df$Zip <- gsub(df$Zip, pattern="-.*", replacement = "")

Recommended answer

The character class you defined has only three elements: 0, 9, and ",". Because the "3" that follows the dash is not in that class, the {0,4} quantifier matches zero characters and only the "-" itself is removed. Inside character class brackets you need to use a dash as the range operator, so try:

    df$Zip <- gsub(df$Zip, pattern="-[0-9]{0,4}", replacement = "")
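
To see the difference between the patterns side by side, here is a minimal sketch that runs all three on a small character vector. The vector `zip` and the result names `broken`, `fixed`, and `greedy` are made up for illustration; only the first value comes from the question, and the other two are hypothetical.

```r
# Hypothetical sample values; only the first one appears in the question.
zip <- c("01020-3964", "60601", "94105-1234")

# Original attempt: [0,9] matches only "0", "9", or ",".
# The digit after the dash is not in the class, so {0,4} matches
# zero characters and only the "-" is removed.
broken <- gsub(pattern = "-[0,9]{0,4}", replacement = "", x = zip)

# Corrected pattern: [0-9] is the range of digits 0 through 9.
fixed <- gsub(pattern = "-[0-9]{0,4}", replacement = "", x = zip)

# The asker's alternative: drop the dash and everything after it.
greedy <- gsub(pattern = "-.*", replacement = "", x = zip)

print(broken)  # "010203964" "60601" "941051234"
print(fixed)   # "01020" "60601" "94105"
print(greedy)  # "01020" "60601" "94105"
```

On well-formed ZIP+4 values the corrected pattern and "-.*" give the same result. They differ only when something other than up to four digits follows the dash, in which case "-.*" removes all of it.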
