Create a new column in an R data frame using pattern matching


Problem description

I am trying to create a new column, based on an existing column, using pattern matching. The existing column is a user agent field such as

"Mozilla/5.0 (iPad; U; CPU OS 3_2 like Mac OS X; en-us) AppleWebKit/531.21.10 (KHTML, like Gecko) Version/4.0.4 Mobile/7B367 Safari/531.21.10"

I want to create a new column that uses pattern matching to identify what the device is.

- If user_agent like '%iPad%' and user_agent like '%WebKit%', then the device is iPad.
- If user_agent like '%Android%' and user_agent not like '%Mobile%', then the device is android.
- If user_agent like '%Silk%' and user_agent like '%WebKit%', then the device is kindle.
- If user_agent like '%Playbook%', then the device is Other.

I want to try using the mutate function in dplyr to create the new column, but I need help with how to structure the regular expression, i.e.

mutate(data, device = ....)

Recommended answer

Something like this:

x <- c("Mozilla/5.0 (iPad; stuff AppleWebKit more stuff",
        "Android",
        "stuff Silk more stuff and WebKit",
        "stuff Playbook more stuff", 
        "unknown")

y <- ifelse(grepl("iPad", x) & grepl("WebKit", x), "iPad", 
        ifelse(grepl("Android", x) & !grepl("Mobile", x), "android", 
                ifelse(grepl("Silk", x) & grepl("WebKit", x), "kindle", 
                        ifelse(grepl("Playbook", x), "other", 
                                "don't know")
                )
        )
)

data.frame(x, y)
                                                x          y
1 Mozilla/5.0 (iPad; stuff AppleWebKit more stuff       iPad
2                                         Android    android
3                stuff Silk more stuff and WebKit     kindle
4                       stuff Playbook more stuff      other
5                                         unknown don't know
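
Since the question asks for a mutate() form, the same first-match-wins logic can probably be written with dplyr's case_when(). This is only a sketch: the data frame name `data` and the column name `user_agent` are assumed from the question, the sample rows are the made-up strings from above, and `fixed = TRUE` is an optional choice that treats each pattern as a literal substring rather than a regular expression.

```r
library(dplyr)

# Sample data; the column name `user_agent` is assumed from the question.
data <- data.frame(
  user_agent = c("Mozilla/5.0 (iPad; stuff AppleWebKit more stuff",
                 "Android",
                 "stuff Silk more stuff and WebKit",
                 "stuff Playbook more stuff",
                 "unknown"),
  stringsAsFactors = FALSE
)

# Small helper (hypothetical name) for a literal substring test.
has <- function(pattern, text) grepl(pattern, text, fixed = TRUE)

# case_when() evaluates the conditions in order and uses the first one that is TRUE.
data <- mutate(
  data,
  device = case_when(
    has("iPad", user_agent) & has("WebKit", user_agent)     ~ "iPad",
    has("Android", user_agent) & !has("Mobile", user_agent) ~ "android",
    has("Silk", user_agent) & has("WebKit", user_agent)     ~ "kindle",
    has("Playbook", user_agent)                             ~ "other",
    TRUE                                                    ~ NA_character_
  )
)

data
```

Rows that match none of the rules (here the "unknown" row) get NA; replace the last branch with a string such as "don't know" if a label is preferred.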

Edit

Or perhaps more simply:

device <- rep(NA_character_, length(x))

device[grepl("iPad", x) & grepl("WebKit", x)] <-  "iPad"
device[grepl("Android", x) & !grepl("Mobile", x)] <-  "android"
device[grepl("Silk", x) & grepl("WebKit", x)] <-  "kindle"
device[grepl("Playbook", x)] <-  "other"

data.frame(x, device)

                                                x  device
1 Mozilla/5.0 (iPad; stuff AppleWebKit more stuff    iPad
2                                         Android android
3                stuff Silk more stuff and WebKit  kindle
4                       stuff Playbook more stuff   other
5                                         unknown    <NA>
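
One thing to watch for with the sequential-assignment version: if a string matches more than one rule, the last matching assignment overwrites the earlier ones, whereas nested ifelse() keeps the first match. The sketch below illustrates this with a made-up user agent string that contains both "iPad"/"WebKit" and "Playbook"; the strings and the names `first_wins` and `last_wins` are hypothetical and only there for the comparison.

```r
# Made-up strings; the first one matches both the iPad rule and the Playbook rule.
x <- c("stuff iPad WebKit Playbook", "stuff Playbook")

# Nested ifelse(): the first matching rule wins.
first_wins <- ifelse(grepl("iPad", x) & grepl("WebKit", x), "iPad",
                     ifelse(grepl("Playbook", x), "other", NA_character_))

# Sequential assignment: a later matching rule overwrites an earlier one.
last_wins <- rep(NA_character_, length(x))
last_wins[grepl("iPad", x) & grepl("WebKit", x)] <- "iPad"
last_wins[grepl("Playbook", x)] <- "other"

data.frame(x, first_wins, last_wins)
```

If the rules can overlap in real data and the first rule should take priority, one option is to apply the assignments in reverse order so that the highest-priority rule is assigned last.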
