Create column based on presence of string pattern and ifelse


Question

I would like to fill in a new column with one of two values, depending on whether a pattern is matched.

Here is my data frame:

df <- structure(list(loc_01 = c("apis", "indu", "isro", "miss", "non_apis", 
"non_indu", "non_isro", "non_miss", "non_piro", "non_sacn", "non_slbe", 
"non_voya", "piro", "sacn", "slbe", "voya"), loc01_land = c(165730500, 
62101800, 540687600, 161140500, 1694590200, 1459707300, 1025051400, 
1419866100, 2037064500, 2204629200, 1918840500, 886299300, 264726000, 
321003900, 241292700, 530532000)), class = "data.frame", row.names = c(NA, 
-16L), .Names = c("loc_01", "loc01_land"))

It looks like this:

     loc_01 loc01_land
1      apis  165730500
2      indu   62101800
3      isro  540687600
4      miss  161140500
5  non_apis 1694590200
6  non_indu 1459707300
7  non_isro 1025051400
8  non_miss 1419866100
9  non_piro 2037064500
10 non_sacn 2204629200
11 non_slbe 1918840500
12 non_voya  886299300
13     piro  264726000
14     sacn  321003900
15     slbe  241292700
16     voya  530532000

I would like to add a column to df called 'loc01'. If loc_01 contains "non", the new column should be 'outside'; if it does not contain "non", it should be 'inside'. This is my ifelse statement, but I'm missing something, because it only ever returns the false value ('inside').

df$loc01 <- ifelse(df$loc_01 == "non", 'outside', 'inside')

The resulting df:

     loc_01 loc01_land  loc01
1      apis  165730500 inside
2      indu   62101800 inside
3      isro  540687600 inside
4      miss  161140500 inside
5  non_apis 1694590200 inside
6  non_indu 1459707300 inside
7  non_isro 1025051400 inside
8  non_miss 1419866100 inside
9  non_piro 2037064500 inside
10 non_sacn 2204629200 inside
11 non_slbe 1918840500 inside
12 non_voya  886299300 inside
13     piro  264726000 inside
14     sacn  321003900 inside
15     slbe  241292700 inside
16     voya  530532000 inside

Thanks -al

Recommended answer

To check whether a string contains a given substring, you can't use ==, because it tests for exact equality: it returns TRUE only when the whole string is exactly "non".
Use a pattern-matching function instead, for example grepl (one of the grep family of functions), which returns TRUE for every element that contains the pattern:

df$loc01 <- ifelse(grepl("non",df$loc_01),'outside','inside')

Result:

> df
     loc_01 loc01_land   loc01
1      apis  165730500  inside
2      indu   62101800  inside
3      isro  540687600  inside
4      miss  161140500  inside
5  non_apis 1694590200 outside
6  non_indu 1459707300 outside
7  non_isro 1025051400 outside
8  non_miss 1419866100 outside
9  non_piro 2037064500 outside
10 non_sacn 2204629200 outside
11 non_slbe 1918840500 outside
12 non_voya  886299300 outside
13     piro  264726000  inside
14     sacn  321003900  inside
15     slbe  241292700  inside
16     voya  530532000  inside
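
To see the difference between == and grepl in isolation, here is a minimal sketch on a small vector. The vector x is an illustrative subset of the loc_01 values above, not part of the original question. Note that grepl("non", ...) matches "non" anywhere in the string, so it would also flag a hypothetical name such as "canon". If only the "non_" prefix should count, anchoring the pattern with ^ or using startsWith (base R 3.3.0 or later) is a stricter alternative. This is one possible variant, not something the original answer proposes.

```r
# Illustrative subset of the loc_01 values (assumed for this example)
x <- c("apis", "non_apis", "piro", "non_piro")

# == compares whole strings, so no element equals "non"
exact <- x == "non"            # FALSE FALSE FALSE FALSE

# grepl() reports whether the pattern occurs anywhere in each string
contains <- grepl("non", x)    # FALSE TRUE FALSE TRUE

# Stricter variants: accept "non_" only at the start of the string
anchored <- grepl("^non_", x)  # regular expression anchored with ^
prefix <- startsWith(x, "non_")  # plain prefix test, no regex involved

# Same labelling as in the answer, using the prefix test
loc01 <- ifelse(prefix, "outside", "inside")
print(loc01)
```

On the data frame from the question, both variants should give the same labels as grepl("non", df$loc_01), since every "non" there appears as a leading "non_".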
