Create a new column in an R data frame using pattern matching
Problem description
I am trying to create a new column based on an existing column, using pattern matching. The existing column is a user agent field such as
"Mozilla/5.0 (iPad; U; CPU OS 3_2 like Mac OS X; en-us) AppleWebKit/531.21.10 (KHTML, like Gecko) Version/4.0.4 Mobile/7B367 Safari/531.21.10"
I want to create a new column that uses pattern matching to identify what the device is:
- if user_agent like '%iPad%' and user_agent like '%WebKit%', then the device is iPad
- if user_agent like '%Android%' and user_agent not like '%Mobile%', then the device is android
- if user_agent like '%Silk%' and user_agent like '%WebKit%', then the device is kindle
- if user_agent like '%Playbook%', then the device is Other
I want to try using the mutate function in dplyr to create the new column, but I need help with how to structure the regular expression,
i.e. mutate(data, device = ....)
Recommended answer
Something like this:
x <- c("Mozilla/5.0 (iPad; stuff AppleWebKit more stuff",
       "Android",
       "stuff Silk more stuff and WebKit",
       "stuff Playbook more stuff",
       "unknown")
y <- ifelse(grepl("iPad", x) & grepl("WebKit", x), "iPad",
       ifelse(grepl("Android", x) & !grepl("Mobile", x), "android",
         ifelse(grepl("Silk", x) & grepl("WebKit", x), "kindle",
           ifelse(grepl("Playbook", x), "other",
                  "don't know")
         )
       )
     )
data.frame(x, y)
                                                x          y
1 Mozilla/5.0 (iPad; stuff AppleWebKit more stuff       iPad
2                                         Android    android
3                stuff Silk more stuff and WebKit     kindle
4                       stuff Playbook more stuff      other
5                                         unknown don't know
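Since the question asks about mutate in dplyr, the same rules could be sketched there with case_when, which checks its conditions in order and uses the first one that is TRUE, much like the nested ifelse above. This is only a sketch under a few assumptions: the data frame name `data` and the column name `user_agent` are taken from the question, the helper `has` is a hypothetical convenience function (not part of dplyr), and `fixed = TRUE` is an addition so the patterns are treated as literal substrings rather than regular expressions.

```r
library(dplyr)

# Hypothetical data frame; the column name user_agent is assumed from the question
data <- data.frame(
  user_agent = c("Mozilla/5.0 (iPad; stuff AppleWebKit more stuff",
                 "Android",
                 "stuff Silk more stuff and WebKit",
                 "stuff Playbook more stuff",
                 "unknown"),
  stringsAsFactors = FALSE
)

# Hypothetical helper (not part of dplyr): literal substring test
has <- function(pattern, text) grepl(pattern, text, fixed = TRUE)

# Conditions are evaluated top to bottom; the first TRUE one decides the value
data <- mutate(
  data,
  device = case_when(
    has("iPad", user_agent) & has("WebKit", user_agent)     ~ "iPad",
    has("Android", user_agent) & !has("Mobile", user_agent) ~ "android",
    has("Silk", user_agent) & has("WebKit", user_agent)     ~ "kindle",
    has("Playbook", user_agent)                             ~ "other",
    TRUE                                                    ~ "don't know"
  )
)
```

The final `TRUE ~ "don't know"` line acts as the fallback for rows that match none of the rules; dropping it would leave those rows as NA.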
Edit
Or perhaps easier:
device <- rep(NA_character_, length(x))
device[grepl("iPad", x) & grepl("WebKit", x)] <- "iPad"
device[grepl("Android", x) & !grepl("Mobile", x)] <- "android"
device[grepl("Silk", x) & grepl("WebKit", x)] <- "kindle"
device[grepl("Playbook", x)] <- "other"
data.frame(x, device)
                                                x  device
1 Mozilla/5.0 (iPad; stuff AppleWebKit more stuff    iPad
2                                         Android android
3                stuff Silk more stuff and WebKit  kindle
4                       stuff Playbook more stuff   other
5                                         unknown    <NA>
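One difference between the two versions may be worth keeping in mind: the nested ifelse returns the first rule that matches, whereas the sequential assignments let a later rule overwrite an earlier one. The sketch below shows this with a made-up string that satisfies both the iPad rule and the Playbook rule; the string is purely illustrative and not a real user agent.

```r
# Made-up string that matches both the iPad rule and the Playbook rule
ua <- "iPad WebKit Playbook"

# Nested ifelse: the first matching rule wins
first_match <- ifelse(grepl("iPad", ua) & grepl("WebKit", ua), "iPad",
                 ifelse(grepl("Playbook", ua), "other", "don't know"))

# Sequential assignment: the last matching rule wins
last_match <- rep(NA_character_, length(ua))
last_match[grepl("iPad", ua) & grepl("WebKit", ua)] <- "iPad"
last_match[grepl("Playbook", ua)] <- "other"

first_match  # "iPad"
last_match   # "other"
```

If first-match behaviour is wanted with the assignment style, one option is to write the assignments in reverse order of priority, so the most important rule is applied last.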